Title: Process for producing fusion proteins comprising ScFv fragments by a
transformed mould

The present invention relates to the production of a Single Chain antibody
fragment (ScFv fragment) by a transformed mould. In this specification an ScFv
fragment stands for a variable fragment of a heavy chain connected by a linker
peptide to a variable fragment of a light chain.

Background of the invention 

It has been described that ScFv fragments can be produced in various transformed
microorganisms, but with various degrees of success. For example, from WO
93/02198 (TECH. RES. INST. FINLAND; Teeri c.s.) published 04.02.93 it is
known that ScFv fragments can be produced and secreted in several host organisms
(although it is only exemplified in E. coli and S. cerevisiae), provided that a
special linker is used between the heavy chain and the light chain fragments.
That linker comprises a flexible hinge region of a naturally secreted
multidomain protein or an analogue thereof not being homologous to either of the
heavy or light chain fragments. This WO 93/02198 is incorporated herein by
reference. A serious limitation of the method disclosed in WO 93/02198 is the
low production level shown, which is far below the production level required for
the application of ScFv fragments in consumer products at a reasonable price.
Examples of such consumer products include detergent products, food products,
and products for the personal care of people like toilet soap and underarm
hygienic products. Thus there is a need for a more universal high-yielding
production system for ScFv fragments.

The production of an ScFv fragment in E. coli bacteria gives relatively low
yields and there is a need for solubilization and subsequent renaturation of the
proteins formed inside the bacteria, which makes this method unattractive for
the production of antibody fragments that need to be used in relatively large
amounts (see page 3, lines 5-23 of WO 93/02198). When attempting to produce
various ScFv fragments in yeasts using expression systems that have produced
various heterologous enzymes in amounts sufficient for economical application in
consumer goods, the present inventors found that the ScFv fragments were not
secreted or only in very minute quantities. This appears to be in agreement with
Example 2 on pages 29-31 of WO 93/02198, which relates to the production of an
ScFv fragment in yeast without indicating the amount produced. Although in WO
93/02198 many alternative linkers are mentioned, it is stated on page 6 of WO
93/02198 that "... there are no published reports of the analysis or design of
secretable linker peptides." and "... there are no published examples to date of
novel fusion proteins with added heterologous linker sequences which are
secreted to the culture medium of the host."

In another recent publication, namely in WO 92/01797 (OY ALKO AB), published
06.02.92, the production of immunoglobulins in the mould Trichoderma is
described. In Example 20 on pages 83-85 and Figure 27 the construction and
expression of a functional gene encoding a single chain antibody containing
variable regions of both a light and heavy chain linked to each other by a
flexible hinge region of CBHI is described (CBHI is cellobiohydrolase I present
in large amounts in the culture medium of Trichoderma reesei; see page 3 of WO
92/01797). The gene was under control of a T. reesei cbh1 terminator and either
a T. reesei cbh1 promoter (plasmid pEN401) or an Aspergillus gpd promoter
(plasmid pEN402). The plasmids were transformed to Trichoderma reesei strain
RUT-C-30 (ATCC 56765) and the transformants were grown in two different media.
Expression of immunoreactive single chain antibodies was tested from culture
supernatants but no results were mentioned. Thus it was not demonstrated that
any amount of single chain antibodies was actually formed. This conclusion is in
agreement with a later related publication of Nyyssonen et al ex VTT
Biotechnical Laboratory, Finland (1993) in which partially the same experiments
are described with plasmids pEN304, pAJ202 and pEN209 encoding the 23.3 kD light
chain, the 23.9 kD heavy Fd chain and the 73.2 kD CBHI-heavy Fd chain,
respectively, which plasmids are also exemplified in WO 92/01797. In this
publication only the production of a separate light chain or a separate heavy
chain, as such or as a precursor, by a Trichoderma reesei strain is described,
but the production of an ScFv fragment containing a light chain connected via a
linker peptide to a heavy chain is not described.

Therefore, there is still a need for an alternative production and secretion
system for ScFv fragments in a mould that gives at least a reasonable yield of
the desired ScFv fragment. The present invention provides such production using
a transformed mould of the genus Aspergillus.

According to M. Ward et al (1990), see also GENENCOR's WO 90/15860
published 27.12.90, the production in Aspergillus of a desired protein and
subsequent secretion can be improved when a fusion protein comprising the
desired protein and a mould protein is produced. This was exemplified with the
production of prochymosin fused with its amino terminus to the carboxyl terminus
of A. awamori glucoamylase. However, that publication does not give any
suggestion that such an approach would also be suitable for the production of
ScFv fragments, which are known as compounds presenting great difficulties when
one attempts to obtain their production and secretion by a microbial host (see
the above mentioned WO 93/02198).

In UNILEVER'S not prior-published WO 93/12237, now published 24.06.93 and
claiming a priority date of 09.12.91, a process for the production and secretion
of a desired protein by a transformed mould is described, in which the
expression and/or secretion regulating regions are derived from the
endoxylanase II gene (exlA gene) of Aspergillus niger var. awamori present on
plasmid pAW14B (see Figure 3 of WO 93/12237), which is present in a transformed
E. coli strain JM109 deposited under the Budapest Treaty at the Centraalbureau
voor Schimmelcultures in Baarn, The Netherlands, as N° CBS 237.90 on 31 May
1990. In a preferred embodiment the desired protein can be part of a fusion
protein comprising the desired protein preceded at its NH2-terminus by at least
part of the endoxylanase II protein. No mention is made of the production of
ScFv fragments.

Summary of the invention 

The present invention provides a process for producing fusion proteins
comprising ScFv fragments by a transformed mould, in which (a) the mould belongs
to the genus Aspergillus, and (b) the Aspergillus contains a DNA sequence
encoding the ScFv fragment under control of at least one expression and/or
secretion regulating region derived from a mould selected from the group
consisting of promoter sequences, terminator sequences and signal
sequence-encoding DNA sequences, and functional derivatives or analogues
thereof, optionally followed by a proteolytic cleavage step for separating the
ScFv fragment part from the fusion protein. In one embodiment the "at least one
expression and/or secretion regulating region derived from a mould" comprises
the combination of both a promoter sequence and a signal sequence-encoding DNA
sequence derived from a glucoamylase gene ex Aspergillus plus a terminator
sequence of a trpC gene ex Aspergillus or at least one functional derivative or
analogue thereof. In another embodiment the "at least one expression and/or
secretion regulating region derived from a mould" is selected from a promoter, a
signal sequence-encoding DNA sequence and a terminator sequence derived from an
endoxylanase gene ex Aspergillus, especially from the endoxylanase II gene (exlA
gene) of Aspergillus niger var. awamori present on the above mentioned plasmid
pAW14B or at least one functional derivative or analogue thereof.

In a preferred embodiment of the present invention the DNA sequence encoding
the ScFv fragment forms part of a chimeric gene encoding a fusion protein,
whereby said DNA sequence encoding the ScFv fragment is preceded at its 5' end
by at least part of a structural gene encoding the mature part of a secreted
mould protein, especially a mature Aspergillus protein, e.g. the mature
glucoamylase protein or the mature endoxylanase protein. If the ScFv fragment in
the fusion protein is connected or bound to said secreted mould protein or part
thereof by a proteolytic cleavage site, e.g. a KEX2-like site, it is possible to
remove the mould protein or part thereof from the ScFv fragment, so that the
resulting antibody fragment is as small as possible, which can have significant
advantages in applications. In this case the process according to the invention
includes a proteolytic cleavage step for separating the ScFv fragment part from
the fusion protein following the production of the fusion protein containing the
ScFv fragment. It was found that production levels of at least 40 mg ScFv
fragment per litre, or even at least 60 mg/l, and a highest yield of slightly
more than 90 mg/l could be obtained (see Table 2 below), but it is envisaged
that after further optimization at least 150 mg/l can be achieved by cultivation
in shake flasks. Further, production levels of more than 150 mg ScFv fragment
per litre were already obtained with cultivation in a fermenter; it is therefore
envisaged that after further optimization at least 250 mg/l, or even at least
500 mg/l, and probably more than 1 g/l will be obtainable.

The invention also provides new products comprising an ScFv fragment or fusion
product thereof obtainable by a process according to the invention. Such a new
product can be one in which the ScFv fragment is a modified ScFv fragment
comprising complementarity determining regions (CDRs) grafted on the framework
regions of the variable fragments of another ScFv fragment that is well
expressed and secreted by a lower eukaryote, especially a mould of the genus
Aspergillus. The invention also provides a composition, in particular consumer
products of which examples are given above, containing a product produced by a
process according to the invention or a new product as described above.
According to a special embodiment of the invention the ScFv fragment recognizes
a compound present in the human eco-system, which compound can be a
microorganism, an enzyme or another protein. One preference is for compounds
present in the oral cavity, and more preferably for compounds involved in the
formation of plaque, caries, gingivitis, periodontal diseases, or bad breath.
Another preference is for compounds present on the human skin, more preferably
compounds involved in the formation of malodour, inflammation or hair loss.
Another special embodiment of the invention relates to a composition which can
be used for diagnostic purposes and in which the compound is a hormone,
especially human chorionic gonadotropin (HCG).

According to another embodiment of the invention the ScFv fragment recognizes a 
compound present in the eco-system of domestic and agricultural animals which 
compound can be an animal feed component, an enzyme or another protein, or a 
disease causing agent. 

According to still another embodiment of the invention a composition is provided
in which the ScFv fragment recognizes a compound that has a positive or negative
relationship with a disease or disorder and can for example be used for
detection and/or targeting purposes.

The invention also relates to a composition according to the invention which can
be used in the chemical, petrol or pharmaceutical industry as a catalyst or for
detection purposes.

Although the invention was developed on the basis of the production of ScFv
fragments in a mould of the genus Aspergillus, as will be illustrated in the
Examples below, it is envisaged that the invention will also be applicable to
other moulds, especially selected from the genera Mucor, Neurospora, and
Penicillium.

Brief description of the figures 

Figure 1 Schematic drawing of pAN52-10.
Figure 2 Schematic drawing of pUR4155 and pUR4157.
Figure 3 Schematic drawing of pAN56-7.
Figure 4 Schematic drawing of pUR4159 and pUR4161.
Figure 5 Western blot. After gel electrophoresis on a 12.5% SDS-PAGE gel,
         proteins reacting with Fv-lysozyme antiserum are visualized.
         Lane 1: E. coli extract containing ScFv-lysozyme; Lane 2: Fv-lysozyme;
         Lanes 3 to 8 contain medium samples of AWC(M)41 transformants and
         the A. niger var. awamori mutant #40 strain; Lanes 3 and 4:
         transformant AWC(M)4161 (prepro-"glaA2"-KEX-ScFv-HCG); Lane 5: AWC4159
         (prepro-"glaA2"-KEX-ScFv-LYS); Lane 6: mutant #40; Lane 7:
         AWC4157 (18aa glaA-ScFv-HCG); Lane 8: AWC4155 (18aa glaA-ScFv-LYS).
Figure 6 Map of plasmid pAW14B obtained by insertion of the 5.3 kb SalI
         fragment comprising the exlA gene of Aspergillus niger var. awamori in
         the SalI site of pUC19.
Figure 7 Coomassie Brilliant Blue-stained polyacrylamide gel showing proteins
         present in the culture medium of an Aspergillus niger var. awamori
         transformed with pUR4462; also indicated are the bands representing
         (i) the released ScFv-LYS fragment, and
         (ii) the glaA-KEX2-ScFv-LYS fusion protein and/or the truncated
         glaA protein.

Detailed description of the invention 

It has now been found that the development described above by M. Ward et al
(1990) and in WO 90/15860 (in which the gene encoding the desired protein forms
part of a chimeric gene further comprising a gene encoding the glucoamylase
protein) as well as the above described preferred embodiment of the invention
described in UNILEVER'S above mentioned not prior-published WO 93/12237 (in
which the gene encoding the desired protein forms part of a chimeric gene
further comprising a gene encoding at least part of the endoxylanase protein)
can be applied advantageously for the production of ScFv fragments, so that the
desired protein is the ScFv fragment. This is particularly so when in the
resulting fusion protein a proteolytic cleavage site is present between the
secreted mould protein part or fragment thereof and the ScFv part. A preferred
cleavage site is a KEX2-like site as described by Fuller et al (1988), Contreras
et al (1991) and Calmels et al (1991), but other cleavage sites can also be used
provided that they are not present in the ScFv fragment. Other cleavage sites
can be selected on the basis of the method described by Matthews & Wells (1993).
In the Examples given below the pro part of the prepro-glucoamylase protein
comprises a KEX2-type recognition site, see Example 2.4 (i).
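
The requirement that a chosen cleavage site should not occur inside the ScFv fragment itself can be checked by a simple motif scan. The following is a minimal sketch, assuming the site can be represented as a plain dipeptide string such as Lys-Arg ("KR"); the function name `find_dibasic_sites` is hypothetical and a real protease may have additional sequence-context requirements that this scan ignores.

```python
from typing import List


def find_dibasic_sites(protein: str, motif: str = "KR") -> List[int]:
    """Return the 0-based start position of every occurrence of the motif."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start)
        start = protein.find(motif, start + 1)
    return positions


# Propeptide of glucoamylase as quoted in Example 2.4 (Asn-Val-Ile-Ser-Lys-Arg).
propeptide = "NVISKR"
# The (GGGGS)3 linker peptide, used here only as a motif-free test string.
linker = "GGGGS" * 3

print(find_dibasic_sites(propeptide))  # the Lys-Arg pair at the end
print(find_dibasic_sites(linker))      # no Lys-Arg pair
```

A candidate ScFv amino acid sequence can be passed to the same function; an empty result for the interior of the fragment suggests, under the stated assumption, that the site is usable as a junction.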

ScFv fragments that recognize microorganisms present in the oral cavity or on
the skin of human beings are important in the framework of this invention,
because they have potential to inhibit the growth or metabolism of these
microorganisms. Certain microorganisms present in the oral cavity are thought to
be involved in the formation of plaque, caries, gingivitis or periodontal
diseases, etc., whereas microorganisms on the human skin are involved in,
amongst others, the generation of malodour. The ScFv fragments prepared
according to the invention may exert their action either as such, or bound to
other compounds that have an inhibitory effect on said microorganisms.

It is also envisaged that according to the present invention other modified ScFv
fragments can be made by grafting a complementarity determining region (CDR) on
the framework regions of the variable fragments of an ScFv fragment that is well
expressed and secreted in Aspergillus; compare grafting of CDRs on human
immunoglobulins as described by e.g. Jones et al (1986). These CDRs can be
obtained from common antibodies. Both the binding properties of a CDR and the
remainder of the ScFv fragment can be optimized by random or directed
mutagenesis. Thus in a process according to the invention CDRs originating from
one antibody can be grafted on the framework regions of the variable fragments
of another ScFv fragment.

Some ScFv fragments or fusion products thereof produced by a process according
to the invention may be old, but many of the ScFv fragments or fusion products
thereof will be new products. Thus the invention also provides new ScFv
fragments or fusion products thereof obtainable by a process according to the
invention. The products resulting from such process can be used in compositions
for various applications. Therefore, the invention also relates to compositions
containing a product produced by a process according to the invention. This
holds for both old products and new products.

Instead of the combination of an exlA promoter, an exlA signal sequence-encoding
DNA sequence, and an exlA terminator exemplified in Examples 3 and 5, other
combinations can also be used, e.g. an exlA promoter, a glaA signal
sequence-encoding DNA sequence, and an exlA terminator as exemplified in Example
7, but in general a selection can be made from any mould-derived promoter,
mould-derived signal sequence-encoding DNA sequence, and mould-derived
terminator sequence as expression and/or secretion regulating regions. A
specific embodiment is a combination of both a promoter sequence and a signal
sequence-encoding DNA sequence derived from a glucoamylase gene ex Aspergillus
plus a terminator sequence of a trpC gene ex Aspergillus.

The secreted mould protein forming part of a fusion protein according to the
invention can in general be derived from any secreted mould protein in addition
to the exemplified endoxylanase II protein ex Aspergillus niger var. awamori
(see Examples 3 and 5) and the exemplified glucoamylase ex Aspergillus (see
Example 7).

WO 94/29457
PCT/EP94/01906

Table 2 in Example 2.6.1b shows that the highest expression and secretion yield
was obtained when the mould protein was composed of its prepro part followed by
an appreciable part of its mature protein, which was connected to the ScFv
fragment by again the pro part of the mould protein containing a KEX2-like
cleavage site. A small linker peptide may be situated between the ScFv fragment
and the KEX2-like cleavage site (see plasmids pUR4159 and pUR4163 and
derivatives) or between the latter and the part of the mature mould protein.

Thus in its broadest sense the invention provides a process for producing fusion
proteins comprising ScFv fragments by a transformed mould, in which the mould
belongs to the genus Aspergillus, and the Aspergillus contains a DNA sequence
encoding the ScFv fragment under control of at least one expression and/or
secretion regulating region derived from a mould selected from the group
consisting of promoter sequences, terminator sequences and signal
sequence-encoding DNA sequences, or functional derivatives or analogues thereof.

The invention will be illustrated by the following Examples. 


Example 1 Isolation of the antibody gene fragments encoding the VH and VL
regions and the construction of ScFv genes.

The isolation of RNA from the hybridoma cell lines, the preparation of cDNA and
amplification of gene fragments encoding the variable regions of the heavy (VH)
and light (VL) chains of the antibodies by PCR were performed according to
standard procedures known from the literature (see e.g. Orlandi et al, 1989).
The general procedures described in the Examples were performed according to
Sambrook et al, unless otherwise indicated.

After cloning the VH and VL gene fragments and determining the nucleotide
sequence, they can be used to construct expression plasmids encoding e.g. Fv or
ScFv antibody fragments. In the ScFv antibody fragments, the VH and the VL
chains are connected via a peptide linker. This is achieved by constructing a
(chimeric) gene in which the gene fragments encoding the VH and VL chains are
connected with a nucleotide sequence encoding the linker peptide. The order of
the variable chains can be VH-linker-VL or VL-linker-VH. In the following
experiments the peptide linker with the sequence (GGGGS)3 is used (SEQ. ID.
NO: 1).
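
As a minimal illustration of the two arrangements described above, the following Python sketch joins two variable-domain sequences through the (GGGGS)3 linker. The function name `build_scfv` and the short placeholder domain strings are hypothetical and serve only to show the ordering; only the linker sequence is taken from the text.

```python
# (GGGGS)3 linker peptide (SEQ. ID. NO: 1 in the text).
LINKER = "GGGGS" * 3


def build_scfv(vh: str, vl: str, order: str = "VH-VL") -> str:
    """Join two variable domains through the linker in the requested order."""
    if order == "VH-VL":
        return vh + LINKER + vl
    if order == "VL-VH":
        return vl + LINKER + vh
    raise ValueError("order must be 'VH-VL' or 'VL-VH'")


# Placeholder fragments for illustration only, not complete domains.
vh_demo = "QVQLQESGPG"
vl_demo = "DIELTQSPAS"

print(build_scfv(vh_demo, vl_demo))                 # VH-linker-VL
print(build_scfv(vh_demo, vl_demo, order="VL-VH"))  # VL-linker-VH
```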

1.1 Construction of ScFv anti-lysozyme

Plasmid pScFv-LYS-myc was obtained from G. Winter and was described by S.
Ward et al (1989). This pUC19-derived plasmid contains a gene fragment
encoding the VH and VL fragments of the anti-hen egg white lysozyme antibody
D1.3. The VH fragment is preceded by the PelB secretion signal sequence, the VH
and VL fragments are connected via the (GGGGS)3 peptide linker (SEQ. ID. NO:
1) and the VL fragment is extended with an 11 amino acid myc-tag. The
nucleotide sequence (SEQ. ID. NO: 2) and the deduced amino acid sequence
(SEQ. ID. NO: 3) of the HindIII-EcoRI fragment encoding the ScFv fragment of
the monoclonal anti-lysozyme antibody D1.3, preceded by the PelB signal sequence
and followed by the myc-tail, are given below.



Nucleotide and deduced amino acid sequence of ScFv-LYS-myc

    HindIII
  1 AAGCTTGCATGCAAATTCTATTTCAAGGAGACAGTCATAATG AAATACCT 50
    M K Y L
    > PelB ss

 51 ATTGCCTACGGCAGCCGCTGGATTGTTATTACTCGCTGCCCAACCAGCGA 100
    LPTAAAGLLLLAAQPA

    PstI
101 TGGCCCAGGTGCAG CTGCAG GAGTCAGGACCTGGCCTGGTGGCGCCCTCA 150
    MAQVQLQESGPGLVAPS
    > VH

151 CAGAGCCTGTCCATCACATGCACCGTCTCAGGGTTCTCATTAACCGGCTA 200
    QSLSITCTVSGFSLTGY
    >

201 TGGTGTAAACTGGGTTCGCCAGCCTCCAGGAAAGGGTCTGGAGTGGCTGG 250
    GVNWVRQPPGKGLEWL
    CDR I <

251 GAATGATTTGGGGTGATGGAAACACAGACTATAATTCAGCTCTCAAATCC 300
    GMIWGDGNTDYNSALKS
    > CDR II <

301 AGACTGAGCATCAGCAAGGACAACTCCAAGAGCCAAGTTTTCTTAAAAAT 350
    RLSISKDNSKSQVFLKM

351 GAACAGTCTGCACACTGATGACACAGCCAGGTACTACTGTGCCAGAGAGA 400
    NSLHTDDTARYYCARE
    >

    BstEII
401 GAGATTATAGGCTTGACTACTGGGGCCAAGGCACCAC GGTCACC GTCTCC 450
    RDYRLDYWGQGTTVTVS
    CDR III <

451 TCAGGTGGAGGCGGTTCAGGCGGAGGTGGCTCTGGCGGTGGCGGATCGGA 500
    SGGGGSGGGGSGGGGSD
    > Linker < >

    SacI
501 CATCGAGCTCACTCAGTCTCCAGCCTCCCTTTCTGCGTCTGTGGGAGAAA 550
    IELTQSPASLSASVGE
    VL

551 CTGTCACCATCACATGTCGAGCAAGTGGGAATATTCACAATTATTTAGCA 600
    TVTITCRASGNIHNYLA
    > CDR I <

601 TGGTATCAGCAGAAACAGGGAAAATCTCCTCAGCTCCTGGTCTATTATAC 650
    WYQQKQGKSPQLLVYYT
    >

651 AACAACCTTAGCAGATGGTGTGCCATCAAGGTTCAGTGGCAGTGGATCAG 700
    TTLADGVPSRFSGSGS
    CDR II <

701 GAACACAATATTCTCTCAAGATCAACAGCCTGCAACCTGAAGATTTTGGG 750
    GTQYSLKINSLQPEDFG

751 AGTTATTACTGTCAACATTTTTGG AGTACTCCTCGG ACGTTCGGTGGAGG 800
    SYYCQHFWSTPRTFGGG
    > CDR III <

    XhoI
801 CACCAAGCTCGAGATCAAACGGGAACAAAAACTCATCTCAGAAGAGGATC 850
    TKLEIKREQKLISEED
    > myc tail

    BclI  BamHI  EcoRI
851 TGAATTAATAATGATCAAACGGTAATA AGGATCC AGCTCGAATTC 895
    L N * * *
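
The restriction sites annotated in the listing can be located programmatically. The sketch below is illustrative only: it copies two rows of the listing (positions 101-150 and 401-450) and searches them for the PstI and BstEII sequences as they appear there. The helper name `find_site` is hypothetical, and the spaces printed in the listing are removed before searching.

```python
def find_site(row_start: int, row_seq: str, site: str) -> int:
    """Return the 1-based listing position of the first match, or -1 if absent."""
    offset = row_seq.replace(" ", "").find(site)
    return -1 if offset == -1 else row_start + offset


# Rows copied from the ScFv-LYS-myc listing (start positions 101 and 401).
ROW_101 = "TGGCCCAGGTGCAG CTGCAG GAGTCAGGACCTGGCCTGGTGGCGCCCTCA"
ROW_401 = "GAGATTATAGGCTTGACTACTGGGGCCAAGGCACCAC GGTCACC GTCTCC"

pst1_pos = find_site(101, ROW_101, "CTGCAG")     # PstI site as printed
bsteii_pos = find_site(401, ROW_401, "GGTCACC")  # BstEII site as printed

print(pst1_pos, bsteii_pos)  # prints: 115 438
```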

In order to remove the myc-tag of pUC19-derived pScFv-LYS-myc the XhoI-EcoRI
fragment was replaced by a new synthetic fragment having the following sequence:

        E   I   K   R   *   *             (SEQ. ID. NO: 6)
5'- TC GAG ATC AAA CGG TAA TGA G      -3' (SEQ. ID. NO: 4)
3'-      C TAG TTT GCC ATT ACT CTT AA -5' (SEQ. ID. NO: 5)
    XhoI                       EcoRI

introducing a TAA translation termination codon after the VL-gene fragment. The
obtained plasmid was named pUR4121. Subsequently, the about 820 bp
HindIII-EcoRI fragment encoding the ScFv-LYS was isolated and cloned into a
pEMBL9-derived plasmid (Dente et al, 1983), which was digested with the same
enzymes, resulting in plasmid pUR4129.
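
The reading frame of the synthetic XhoI-EcoRI fragment can be confirmed by translating its upper strand. The sketch below is a minimal illustration using the standard genetic code; the function name `translate` and the choice to start reading at the third base (so that the first full codon is GAG) are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Upper strand of the synthetic fragment (SEQ. ID. NO: 4), spaces removed.
FRAGMENT = "TCGAGATCAAACGGTAATGAG"

# Skip the first two bases so that translation starts at the GAG codon.
print(translate(FRAGMENT[2:]))  # prints: EIKR**
```

The two consecutive stop codons (TAA followed by TGA) appear directly after the codons for Glu-Ile-Lys-Arg.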

1.2 Construction of a gene encoding ScFv anti-human chorionic
gonadotropin

Human chorionic gonadotropin (HCG) is a pregnancy hormone. A pregnancy test
kit based on the detection of HCG in urine by using monoclonal antibodies was
developed by Unilever and is marketed by UNIPATH under the trade name
Clearblue®. Gene fragments encoding the variable regions of the heavy and light
chain fragments from the monoclonal antibody directed against the human
chorionic gonadotropin were obtained from a hybridoma cell line in the way
described above. Subsequently, these HCG VH and VL gene fragments were cloned
into plasmid pUR4129 by replacing the corresponding PstI-BstEII and SacI-XhoI
anti-lysozyme gene fragments, resulting in plasmid pUR4138. The nucleotide
sequence (SEQ. ID. NO: 7) and the deduced amino acid sequence (SEQ. ID. NO:
8) of the PstI-XhoI gene fragment encoding the ScFv fragment of the anti-human
chorionic gonadotropin (anti-HCG) antibody are given below.

Nucleotide sequence and deduced amino acid sequence of ScFv-HCG

    PstI
  1 CTGCAGGAGTCTGGGGGACACTTAGTGAAGCCTGGAGGGTCCCTGAAACT 50
    LQESGGHLVKPGGSLKL

 51 CTCCTGTGCAGCCTCTGGATTCGCTTTCAGTAGCTTTGACATGTCTTGGA 100
    SCAASGFAFSSFDMSW
    > CDR I <

101 TTCGCCAGACTCCGGAGAAGAGGCTGGAGTGGGTCGCAAGCATTACTAAT 150
    IRQTPEKRLEWVASITN
    >

151 GTTGGTACTTACACCTACTATCCAGGCAGTGTGAAGGGCCGATTCTCCAT 200
    VGTYTYYPGSVKGRFSI
    CDR II <

201 CTCCAGAGACAATGCCAGGAACACCCTAAACCTGCAAATGAGCAGTCTGA 250
    SRDNARNTLNLQMSSL

251 GGTCTGAGGACACGGCCTTGTATTTCTGTGCAAGACAGGGGACTGCGGCA 300
    RSEDTALYFCARQGTAA
    >

    BstEII
301 CAACCTTACTGGTACTTCGATGTCTGGGGCCAAGGGACCACGGTCACCGT 350
    QPYWYFDVWGQGTTVTV
    CDR III <

351 CTCCTCAGGTGGAGGCGGTTCAGGCGGAGGTGGCTCTGGCGGTGGCGGAT 400
    SSGGGGSGGGGSGGGG
    > Linker

    SacI
401 CGGACATCGAGCTCACCCAGTCTCCAAAATCCATGTCCATGTCCGTAGGA 450
    SDIELTQSPKSMSMSVG
    < > VL

451 GAGAGGGTCACCTTGAGCTGCAAGGCCAGTGAGACTGTGGATTCTTTTGT 500
    ERVTLSCKASETVDSFV
    > CDR I

501 GTCCTGGTATCAACAGAAACCAGAACAGTCTCCTAAATTGTTGATATTCG 550
    SWYQQKPEQSPKLLIF
    < >

551 GGGCATCCAACCGGTTCAGTGGGGTCCCCGATCGCTTCACTGGCAGTGGA 600
    GASNRFSGVPDRFTGSG
    CDR II <

601 TCTGCAACAGACTTCACTCTGACCATCAGCAGTGTGCAGGCTGAGGACTT 650
    SATDFTLTISSVQAEDF

651 TGCGGATTACCACTGTGGACAGACTTACAATCATCCGTATACGTTCGGAG 700
    ADYHCGQTYNHPYTFG
    > CDR III <

    XhoI
701 GGGGGACCAAGCTCGAG 717
    G G T K L E
Example 2 Construction of ScFv expression cassettes, using the glaA
promoter system and introduction into Aspergillus.

2.1 Construction of ScFv expression cassettes using the 18 amino acid signal
sequence of glucoamylase (pUR4155 and pUR4157)

The multiple cloning site of plasmid pEMBL9 (ranging from the EcoRI to the
HindIII site) was replaced by a synthetic DNA fragment having the following
nucleotide sequence.



Nucleotide sequence for synthetic EcoRI-HindIII fragment cloned in pEMBL9 and
used for preparing pUR4153

          18 amino acid signal sequence of
          M   G   F   R   S   L   L   A   L   S   G   L   V
AAT TC C ATG GGC TTC CGA TCT CTA CTC GCC CTG AGC GGC CTC GTC --
     G G TAC CCG AAG GCT AGA GAT GAG CGG GAC TCG CCG GAG CAG --
EcoRI  NcoI

   glucoamylase        N-term ScFv             C-term
    C   T   G   L   A   Q   V   Q   L   Q   *   V   T   K
-- TGC ACA GGG TTG GCA CAG GTG CAG CTG CAG TAA GTG ACT AAG --
-- ACG TGT CCC AAC CGT GTC CAC GTC GAC GTC ATT CAC TGA TTC --
                                   PstI

   ScFv
    L   E   I   K   R   *   *        (SEQ. ID. NO: 11-12)
-- CTC GAG ATC AAA CGG TGA TA        (SEQ. ID. NO: 9)
-- GAG CTC TAG TTT GCC ACT ATT CGA   (SEQ. ID. NO: 10)
   XhoI                    HindIII



The 5'-part of the nucleotide sequence codes for the glaA signal sequence (amino
acid 1 to 18), followed by the first 5 amino acids of the variable part of the
antibody heavy chain. The 3'-part encodes the last 5 amino acid residues of the
variable part of the antibody light chain. The resulting plasmid was named
pUR4153.

Plasmids pUR4154 and pUR4156 were obtained in the following way: After
digestion of plasmid pUR4129 (Example 1.1) with PstI and XhoI, an about 0.7 kb
DNA fragment was isolated from agarose gel. This fragment codes for a truncated
ScFv-LYS fragment missing DNA sequences encoding the 5 N-terminal and 5
C-terminal amino acids. In the same way an about 0.7 kb PstI-XhoI fragment was
isolated from plasmid pUR4138 (Example 1.2), which encodes a similarly
truncated ScFv-HCG fragment.

In order to fuse the ScFv encoding fragments with the glaA secretion
signal-encoding sequence, the obtained fragments were cloned into pUR4153. To
this end plasmid pUR4153 was digested with PstI and XhoI, after which the about
4.1 kb vector fragment was isolated from an agarose gel. Ligation with the about
0.7 kb PstI-XhoI fragments resulted in plasmids pUR4154 (ScFv-LYS) and pUR4156
(ScFv-HCG), respectively.
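
The approximate size of a ligation product follows from adding the sizes of the fragments joined. The short sketch below works this out for the step above; the function name `ligated_size_kb` is hypothetical, and the inputs are the rounded ("about") values from the text, so the result is itself only approximate.

```python
def ligated_size_kb(vector_kb: float, insert_kb: float) -> float:
    """Approximate size of a circular ligation product, to one decimal place."""
    return round(vector_kb + insert_kb, 1)


# About 4.1 kb pUR4153 vector fragment plus about 0.7 kb PstI-XhoI insert.
print(ligated_size_kb(4.1, 0.7))  # prints: 4.8
```

The result of about 4.8 kb is in line with the size of the vector fragment later isolated from pUR4154 and pUR4156 in Example 2.4.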

2.2 Construction of pAN52-10

pAN52-10 (Figure 1) was used as starting vector for the construction of the
Aspergillus expression cassettes. This plasmid was constructed as follows:

In pAN52-6NotI (Van den Hondel et al, 1991) the NcoI site located in the glaA
promoter of A. niger N402 (about 2.7 kb upstream of the ATG) was removed by
cleaving with NcoI and filling in with Klenow polymerase, resulting in
pAN52-6NotI delta NcoI. After digestion of pAN52-6NotI delta NcoI with NotI and
partial digestion with XmnI an about 4.0 kb NotI-XmnI glaA promoter fragment was
isolated. Three-way ligation of this pAN52-6NotI delta NcoI fragment (1) with an
about 3.4 kb NotI-NcoI fragment (2) of pAN52-1NotI (Van den Hondel, C.A.M.J.J.
et al, 1991), comprising the A. nidulans trpC terminator (Punt, J.P. et al,
1991) and pUC18-sequences, and with a synthetic XmnI-NcoI fragment (3)
comprising the 3'-end of the glaA promoter to the ATG initiation codon, resulted
in plasmid pAN52-7NotI. The nucleotide sequence (SEQ. ID. NO: 13-14) of this
synthetic XmnI-NcoI fragment is given below.

    XmnI
5'- GCT TC C TCC CTT TTA GAC GCA ACT GAG AGC CTG AGG TTC ATC --
3'- CGA AGG AGG GAA AAT CTG CGT TGA CTC TCG GAC TCG AAG TAG --

-- CCC AGC ATC ATT ACA CCT GAG C
-- GGG TGG TAG TAA TGT GGA GTC GGT AC
                              NcoI

After isolating both the about 4 kb NotI-NcoI fragment (comprising the glaA
promoter) and the about 3.4 kb NotI-BamHI fragment (comprising the pUC18
vector and the trpC terminator) from pAN52-7NotI, the fragments were ligated
together with the NcoI-BamHI linkers containing an EcoRV site and a HindIII
site and having the following nucleotide sequences (SEQ. ID. NO: 15-16).

5'- CAT GG C C GA TAT C GC AAG CTT CCG      -3'
3'-      CG GCT ATA GCG TTC GAA GG C CTAG -5'
    NcoI     EcoRV    HindIII      BamHI

This resulted in plasmid pAN52-9. Ligation of the about 4.0 kb NotI-HindIII glaA
promoter fragment of pAN52-9 with an about 3.3 kb HindIII-NotI fragment of
pAN52-6NotI containing both pUC18-sequences and an about 0.7 kb trpC
terminator fragment of A. nidulans resulted in pAN52-10 (Figure 1).

2.3 Construction of pUR4155 and pUR4157.

Plasmid pAN52-10 was digested with NcoI and HindIII and the dephosphorylated
vector fragment of about 7.5 kb was isolated. The NcoI site is located
downstream of the glaA promoter and coincides with the ATG initiation codon. The
plasmids pUR4154 and pUR4156 (see Example 2.1) were digested with NcoI and
HindIII and the about 0.8 kb fragments coding for the ss-glaA and the ScFv were
isolated. Ligation of the obtained fragments resulted in plasmids pUR4155 and
pUR4157, respectively (Figure 2). In these plasmids the expression of the ScFv
fragments is under the control of the A. niger glaA promoter, the 18 amino acid
signal sequence of glucoamylase and the A. nidulans trpC terminator.

2.4 Construction of ScFv expression cassettes using part of glucoamylase as 
a secretion carrier. 

i) Construction of pUR4159 and pUR4161. 

Expression cassettes encoding a fusion protein consisting of the glaA prepro part,
the first 514 amino acids of the mature glucoamylase G1 protein ("glaA2" protein),
and the ScFv fragments were constructed. In these cassettes the "glaA2" protein
and the ScFv fragment were separated by a sequence which encodes the
propeptide of glucoamylase (Asn-Val-Ile-Ser-Lys-Arg; SEQ. ID. NO: 45) and which
comprises a KEX2-type recognition site (Lys-Arg). To obtain these vectors, plasmid
pAN56-7 (Figure 3) was constructed by insertion of a 1.9 kb NcoI-EcoRV fragment
of pAN56-4, comprising part of the A. niger glaA gene, into the about 7.5 kb
NcoI-EcoRV fragment of pAN52-10. Plasmid pAN56-4 was not prior-published but its
description is now available in the publication of M.P. Broekhuijsen, I.E. Mattern,
R. Contreras, J.R. Kinghorn & C.A.M.J.J. van den Hondel in Journal of
Biotechnology 31, No. 2 (1993) 135-145, which is incorporated herein by reference;
a copy of the draft paper was attached to the priority documents.
To obtain in-frame fusions of the "glaA2" protein and the ScFv fragments, plasmids
pUR4154 and pUR4156 were digested with EcoRI and PstI, after which the vector
fragment of about 4.8 kb was isolated from an agarose gel. The vector was ligated
with a synthetic EcoRI-PstI fragment having the following nucleotide sequence
(SEQ. ID. NO: 17-19).
                KEX2     spacer       N-term ScFv
         I   S   K   R   G   G   S   Q   V   Q   L   Q
AAT TCG ATA TCG AAG CGC GGC GGA TCC CAG GTG CAG CTG CA
     GC TAT AGC TTC GCG CCG CCT AGG GTC CAC GTC G
EcoRI EcoRV                 BamHI               PstI

This EcoRI-PstI fragment was used to replace the fragment encoding the glaA
signal sequence (see Example 2.1) and to allow an in-frame fusion to the "glaA2"
gene. From the resulting plasmids, pUR4158 and pUR4160, the EcoRV-HindIII
fragments (about 0.75 kb) were isolated and ligated into the EcoRV-HindIII
fragment of pAN56-7 (about 9.3 kb), resulting in pUR4159 and pUR4161 (Figure
4, in which the DNA encoding the 24 amino acid prepro glaA part in the
neighbourhood of the NcoI site was not indicated). In the resulting protein the
"glaA2" part and the ScFv part are connected by a peptide comprising a KEX2
cleavage site.

ii) Construction of pUR4163. 

In a similar way a vector was constructed with an expression cassette encoding a
fusion protein consisting of the "glaA2" protein (preceded by its prepro part) fused
to ScFv-lysozyme and separated from it by a factor Xa recognition site. The
EcoRI-PstI vector fragment (about 4.8 kb) of pUR4154 was ligated with a synthetic
EcoRI-PstI fragment having the following nucleotide sequence (SEQ. ID. NO: 20-22).

                   factor Xa      spacer      N-term ScFv
         I   S   I   E   G   R   G   G   S   Q   V   Q   L   Q
AAT TCG ATA TCG ATC GAA GGT CGA GGC GGA TCC CAG GTG CAG CTG CAG
     GC TAT AGC TAG CTT CCA GCT CCG CCT AGG GTC CAC GTC G
EcoRI EcoRV                         BamHI               PstI

This EcoRI-PstI fragment was used to replace the fragment encoding the glaA
signal sequence and to allow an in-frame fusion to the "glaA2" gene. In the
encoded protein the "glaA2" part and the ScFv part are connected by a peptide
comprising a factor Xa cleavage site. From the resulting plasmid pUR4162, the
EcoRV-HindIII fragment (about 0.75 kb) was isolated and ligated into the pAN56-7
vector fragment (about 9.3 kb), resulting in pUR4163.
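The reading frame of the two synthetic EcoRI-PstI fragments can be confirmed by translating their coding strands with the standard genetic code. The sketch below is illustrative only; the helper names are hypothetical. It assumes translation starts at the ATA codon (six nucleotides into the printed strand) and, for the KEX2 fragment, that the final G of the last codon is supplied by the PstI site after ligation.

```python
# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate a DNA string (spaces ignored) starting at offset `frame`."""
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Coding strands as printed in the text (SEQ. ID. NO: 17-19 and 20-22).
KEX2_TOP = "AAT TCG ATA TCG AAG CGC GGC GGA TCC CAG GTG CAG CTG CA"
XA_TOP = "AAT TCG ATA TCG ATC GAA GGT CGA GGC GGA TCC CAG GTG CAG CTG CAG"

# Assumption: the trailing G completes the PstI site (CTGCAG) after ligation.
print(translate(KEX2_TOP + "G", frame=6))
print(translate(XA_TOP, frame=6))
```

The two translations reproduce the peptides printed above the sequences: ISKRGGSQVQLQ for the KEX2 linker and ISIEGRGGSQVQLQ for the factor Xa linker.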

2.5 Aspergillus transformation

The constructed vectors can be provided with conventional selection markers (e.g.
amdS or pyrG, hygromycin etc.) and the fungus can be transformed with the
resulting vectors to produce the desired protein.



Table 1

Expression vectors for the production of ScFv-anti-lysozyme and ScFv-anti-human
chorionic gonadotropin, respectively, controlled by the A. niger glaA promoter and
A. nidulans trpC terminator with A. nidulans amdS as selection marker

Plasmid    ScFv-antibody   secretion-carrier   cleavage of ScFv-antibody by
pUR4155    ScFv-LYS        18 a.a. ss glaA     signalpeptidase
pUR4159    ScFv-LYS        prepro-"glaA2"      KEX2-enzyme
pUR4163    ScFv-LYS        as in pUR4159       factor Xa
pUR4157    ScFv-HCG        as in pUR4155       signalpeptidase
pUR4161    ScFv-HCG        as in pUR4159       KEX2-enzyme

As an example, the Aspergillus nidulans amdS gene (Hynes M.J. et al., 1983) located
on a 5.0 kb NotI fragment was introduced in the unique NotI sites of the ScFv
expression vectors pUR4155, pUR4157, pUR4159, pUR4161 and pUR4163,
yielding pUR4155NOT, pUR4157NOT, pUR4159NOT, pUR4161NOT and
pUR4163NOT, respectively (Table 1). The amdS NotI fragment was obtained by
flanking the EcoRI fragment of pGW325 (Wernars K.; Ph.D. thesis 1986) with the
following synthetic oligonucleotides.

5'- GGCCGC TGTGCAG      -3' (SEQ. ID. NO: 23)
3'-     CG ACACGTCTTAA  -5' (SEQ. ID. NO: 24)
    NotI            EcoRI
The constructed pUR41..NOT vectors (pUR4155NOT, pUR4157NOT,
pUR4159NOT, pUR4161NOT and pUR4163NOT) were subsequently transferred
to Aspergillus niger var. awamori ATCC 11358 (= CBS 115.52) and a mutant strain
Aspergillus niger var. awamori #40 (WO 91/19782) which has been obtained by
mutagenesis of A. niger var. awamori. Transformation with pUR41..NOT plasmids
was carried out as described in WO 91/19782 or by means of co-transformation
with plasmid pAN7-1 according to Punt P.J. and Van den Hondel C.A.M.J.J.
(1992). pAN7-1 comprises the hygromycin resistance gene of E. coli flanked by
Aspergillus expression signals. The yield of A. niger var. awamori (mutant #40)
protoplasts was 1-5 x 10^7/g mycelium and the viability was 3-8%. Per
transformation 3-8 x 10^5 viable protoplasts were incubated with 10 µg plasmid
DNA purified by the Qiagen method. A. niger var. awamori mutant #40 AmdS+
transformants were selected and purified on plates with minimal medium and
acetamide or acrylamide as sole nitrogen source. Direct selection resulted in up to
0.02 mutant #40 transformants per µg DNA. No A. niger var. awamori
transformants were obtained. Co-transformation of the mutant #40 strain was
performed with a mixture of one of the pUR41..NOT plasmids and pAN7-1 DNA
in a weight ratio of 7:3. pAN7-1 co-transformants were selected primarily on
minimal medium plates containing 100-150 ng/ml hygromycin, followed by
selection on plates with acetamide. The frequency of HmR colonies was about 2
transformants per µg; however, only 5% of the HmR colonies grew well on plates
with acetamide.

A. niger var. awamori mutant #40 transformants obtained by direct selection on
plates with acetamide are called AWC. Mutant #40 co-transformants growing well
on acetamide are called AWCM.
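The frequencies quoted above allow a rough comparison of the two selection routes. The following sketch only restates that arithmetic; the function names are hypothetical, and the 10 µg total DNA used to illustrate the 7:3 weight ratio is an assumption, since the text gives that amount only for the general transformation protocol.

```python
def usable_per_ug(colonies_per_ug, usable_fraction=1.0):
    """Colonies per µg DNA that pass the final acetamide selection."""
    return colonies_per_ug * usable_fraction


def split_by_weight(total_ug, ratio=(7, 3)):
    """Split a DNA amount between two plasmids according to a weight ratio."""
    parts = sum(ratio)
    return tuple(total_ug * r / parts for r in ratio)


# Figures taken from the text: up to 0.02 transformants/µg by direct
# selection; about 2 HmR colonies/µg, of which 5% grow well on acetamide.
direct = usable_per_ug(0.02)
co_transformation = usable_per_ug(2.0, usable_fraction=0.05)

print("direct selection:   ", direct, "per µg")
print("co-transformation:  ", co_transformation, "per µg")
print("fold difference:    ", round(co_transformation / direct, 1))

# Assumed 10 µg total DNA: amounts of pUR41..NOT plasmid and pAN7-1.
print(split_by_weight(10.0))
```

On these numbers, co-transformation followed by acetamide selection yields roughly five times more usable colonies per µg DNA than direct selection.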



The following number of (co-)transformants were further analyzed:

Number of transformants      Number of co-transformants
AWC4155*   3                 AWCM4155   3
AWC4157    7                 AWCM4157   1
AWC4159    2                 AWCM4159   5
AWC4161    2                 AWCM4161   2
                             AWCM4163   2

* 4155 indicates the presence of plasmid pUR4155NOT in the mutant #40 strain.
2.6 ScFv production by Aspergillus transformants

Analysis of Aspergillus niger var. awamori mutant #40 transformants containing
ScFv-fragment encoding sequences after culturing in medium with maltodextrin as
an inducer.

AWC and AWCM transformants were grown in minimal medium (0.05% MgSO4,
0.6% NaNO3, 0.05% KCl, 0.15% KH2PO4 and trace elements) with 5%
maltodextrin (Sigma Dextrin Corn type I; D-2006). Media were sterilized for 30
min at 120°C. Fifty ml medium (shake flask 300 ml) were inoculated with 4 x 10^5
spores/ml, followed by culturing in an air incubator (300 rpm) at 30°C for different
periods. Medium samples were taken after 45 to 50 hours and analyzed by SDS-
PAGE followed by Western blot analyses. Furthermore a quantitative functional
test was carried out by performing a PIN-ELISA assay.
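The medium composition above can be converted into amounts to weigh out for a shake flask. This is a minimal sketch under the assumption that the percentages are weight per volume (g per 100 ml), which the text does not state explicitly; the names are illustrative, and trace elements are omitted because no amounts are given.

```python
# Percentages as listed in the text, assumed to be % w/v (g per 100 ml).
MEDIUM_PERCENT_WV = {
    "MgSO4": 0.05,
    "NaNO3": 0.6,
    "KCl": 0.05,
    "KH2PO4": 0.15,
    "maltodextrin": 5.0,
}


def grams_needed(percent_wv, volume_ml):
    """Grams of a component for `volume_ml` of medium at the given % w/v."""
    return percent_wv * volume_ml / 100.0


def recipe(volume_ml):
    """Amounts (g) of every listed component for one batch of medium."""
    return {name: grams_needed(pct, volume_ml)
            for name, pct in MEDIUM_PERCENT_WV.items()}


# Fifty ml of medium per 300 ml shake flask, as in the text.
for name, grams in recipe(50).items():
    print(f"{name:>12}: {grams:.3f} g")
```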

2.6.1 Medium of ScFv-LYS and ScFv-HCG transformants
2.6.1a Western blot analysis and Coomassie Brilliant Blue-stained gels

Western blot analysis of medium samples of AWC(M)4155 (18 a.a. glaA signal
sequence-ScFv-LYS) (co-)transformants, in which anti-serum directed against
Fv-LYS was used, revealed a band with a molecular mass of about 31 kDa which is
absent in the medium of the mutant strain #40 (Figure 5). The presence of this
band, which runs at the position of a protein with the expected size, points to
secretion of ScFv-LYS into the culture medium.

In medium of several AWC(M)4159 (prepro-"glaA2"-KEX2-ScFv-LYS)
(co-)transformants a similar, much stronger, band was found, indicating a more
efficient secretion of ScFv-LYS by these transformants. This protein band was also
visible on Coomassie Brilliant Blue-stained gels.

In medium samples of AWC(M)4157 (18 a.a. glaA signal sequence + ScFv-HCG) a
faint band was found, while the band in medium of AWC(M)4161 (prepro-"glaA2"-
KEX2-ScFv-HCG) (co-)transformants was clearly visible (molecular mass about 31
kDa). The aspecific signals were identical to the ones obtained with ScFv-LYS
transformants. Some of the results are shown in Figure 5 (Western blot).

Method: SDS-PAGE was carried out on 8-25% gradient gels using the Pharmacia
Phast system or on homogeneous 12.5% home-made SDS-gels. For Western blot
analysis a polyclonal anti-serum against Fv-LYS was used (1:1500) for the
detection of both ScFv-LYS and ScFv-HCG.



2.6.1b Analysis by PIN-ELISA 

The amount of functional ScFv-LYS (as determined by a PIN-ELISA assay) in the
medium of AWC(M) transformants is given in Table 2.

Table 2

Transformant                       construct                       ScFv-fragment (mg/l)
AWCM4155 #102                      18 a.a. ss-glaA-ScFv-LYS        15 - 22 - 11
AWCM4155 #105                      same                            3
AWC 4155 #4                        same                            10
AWC 4155 #5                        same                            2
AWCM4159 #101                      prepro-"glaA2"-KEX2-ScFv-LYS    91 - 66 - 67
AWCM4159 #608                      same                            3
AWCM4159 #610                      same                            16
AWC 4159 #701                      same                            40
AWCM4161 #612                      prepro-"glaA2"-KEX2-ScFv-HCG    4
AWC 4161 #2                        same                            1
A. niger var. awamori mutant #40                                   0

The amount of ScFv-LYS in medium of AWC(M)4155 (18 a.a. glaA) transformants
ranged from 2 to 22 mg/l. AWC(M)4159 (co-)transformants (prepro-"glaA2"-
KEX2-construction) secrete up to about 90 mg/l into the medium, while no
production was found for the A. niger var. awamori mutant #40 strain.
With the quantitative PIN-ELISA assay for the determination of ScFv-HCG it was
found that AWC(M)4161 (co-)transformants ("glaA2"-KEX2-construction) secreted
up to 4 mg/l functional ScFv-HCG into the medium. However, in the medium of
AWC4157 (18 a.a. glaA signal sequence) transformants no ScFv-HCG was detected.

Method: PINs coated with either lysozyme or HCG were incubated with (diluted)
medium samples. Subsequently the PINs were incubated with antiserum against
Fv-LYS and Fv-HCG respectively, then with goat-anti-rabbit conjugate with alkaline
phosphatase. Finally the alkaline phosphatase enzyme-activity was determined after
incubation with p-nitro-phenyl phosphate and the optical density was measured at
405 nm. Using standard solutions of Fv-LYS and Fv-HCG respectively, the amount
of functional ScFv-LYS and ScFv-HCG was calculated.
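The text does not say how the optical densities were converted into concentrations. One common approach, shown here purely as an illustrative sketch, is linear interpolation on a standard curve followed by correction for the sample dilution. The standard values, the dilution factor and all function names below are hypothetical and are not data from this specification.

```python
from bisect import bisect_left


def interpolate_concentration(od, standards):
    """Interpolate a concentration from (concentration, OD405) standards.

    `standards` must be sorted by increasing OD. Readings outside the
    range of the standards are rejected rather than extrapolated.
    """
    ods = [point[1] for point in standards]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the standard curve; re-dilute the sample")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return standards[i][0]
    (c0, od0), (c1, od1) = standards[i - 1], standards[i]
    return c0 + (c1 - c0) * (od - od0) / (od1 - od0)


def sample_concentration(od, standards, dilution_factor):
    """Concentration in the undiluted medium sample."""
    return interpolate_concentration(od, standards) * dilution_factor


# Hypothetical standard curve: (mg/l of Fv standard, OD at 405 nm).
STANDARDS = [(0.0, 0.05), (0.5, 0.30), (1.0, 0.55), (2.0, 1.05)]

# Hypothetical sample: OD 0.80 measured on a 20-fold diluted medium sample.
print(round(sample_concentration(0.80, STANDARDS, dilution_factor=20), 2), "mg/l")
```

Rejecting out-of-range readings is a deliberate choice in this sketch: an enzyme-linked colour reaction saturates at high antigen levels, so extrapolating beyond the highest standard would be unreliable.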



Example 3   Construction of Aspergillus niger var. awamori integration vectors
            for the production of ScFv fragments, using the endoxylanase
            promoter and terminator and a DNA sequence encoding the
            endoxylanase secretion signal and the mature endoxylanase protein.

Although this Example describes the construction of expression plasmids encoding
fusion proteins between the mature endoxylanase protein and the ScFv fragment, it
is obvious that alternative expression plasmids can be constructed in much the
same way in which only part of the endoxylanase protein is used.

3.1 Construction of pUR4158-A. 

After digesting plasmid pScFvLYSmyc (see Example 1.1) with PstI and XhoI, an
about 0.7 kb PstI-XhoI fragment could be isolated from agarose gel. This fragment
codes for a truncated Single Chain Fv-LYS fragment missing the first 5 and the last
5 amino acids (see the nucleotide sequence (SEQ. ID. NO: 25) and deduced amino
acid sequence (SEQ. ID. NO: 26) of the about 700 bp PstI-XhoI fragment encoding
the ScFv fragment of the monoclonal anti-lysozyme antibody D1.3 (ScFv LYS)
given below).



Nucleotide sequence and deduced amino acid sequence of ScFv LYS

    PstI
  1 CTGCAGGAGTCAGGACCTGGCCTGGTGGCGCCCTCACAGAGCCTGTCCAT  50
    LQESGPGLVAPSQSLSI

 51 CACATGCACCGTCTCAGGGTTCTCATTAACCGGCTATGGTGTAAACTGGG 100
    TCTVSGFSLTGYGVNW
    > CDR I <

101 TTCGCCAGCCTCCAGGAAAGGGTCTGGAGTGGCTGGGAATGATTTGGGGT 150
    VRQPPGKGLEWLGMIWG
    >

151 GATGGAAACACAGACTATAATTCAGCTCTCAAATCCAGACTGAGCATCAG 200
    DGNTDYNSALKSRLSIS
    CDR II <

201 CAAGGACAACTCCAAGAGCCAAGTTTTCTTAAAAATGAACAGTCTGCACA 250
    KDNSKSQVFLKMNSLH

251 CTGATGACACAGCCAGGTACTACTGTGCCAGAGAGAGAGATTATAGGCTT 300
    TDDTARYYCARERDYRL
    > CDR III

    BstEII
301 GACTACTGGGGCCAAGGCACCACGGTCACCGTCTCCTCAGGTGGAGGCGG 350
    DYWGQGTTVTVSSGGGG
    < >

    SacI
351 TTCAGGCGGAGGTGGCTCTGGCGGTGGCGGATCGGACATCGAGCTCACTC 400
    SGGGGSGGGGSDIELT
    Linker < > VL

401 AGTCTCCAGCCTCCCTTTCTGCGTCTGTGGGAGAAACTGTCACCATCACA 450
    QSPASLSASVGETVTIT

451 TGTCGAGCAAGTGGGAATATTCACAATTATTTAGCATGGTATCAGCAGAA 500
    CRASGNIHNYLAWYQQK
    > CDR I <

501 ACAGGGAAAATCTCCTCAGCTCCTGGTCTATTATACAACAACCTTAGCAG 550
    QGKSPQLLVYYTTTLA
    > CDR II

551 ATGGTGTGCCATCAAGGTTCAGTGGCAGTGGATCAGGAACACAATATTCT 600
    DGVPSRFSGSGSGTQYS

601 CTCAAGATCAACAGCCTGCAACCTGAAGATTTTGGGAGTTATTACTGTCA 650
    LKINSLQPEDFGSYYCQ
    >

    XhoI
651 ACATTTTTGGAGTACTCCTCGGACGTTCGGTGGAGGCACCAAGCTCGAG 699
    HFWSTPRTFGGGTKLE
    CDR III <

The multiple cloning site of plasmid pEMBL9 (Dente et al., 1983), ranging from
the EcoRI to the HindIII site, can be replaced by a synthetic DNA fragment having
the following nucleotide sequence (SEQ. ID. NO: 27-30).

                KEX2     Spacer       ScFv N-term.
         I   S   K   R   G   G   S   Q   V   Q   L   Q   *
AAT TCG ATA TCG AAG CGC GGC GGA TCC CAG GTG CAG CTG CAG TAA -
     GC TAT AGC TTC GCG CCG CCT AGG GTC CAC GTC GAC GTC ATT -
EcoRI EcoRV                 BamHI               PstI

       ScFv C-term.
   V   T   K   L   E   I   K   R   *   *
- GTG ACT AAG CTC GAG ATC AAA CGG TGA TAA GCT CGC TTA
- CAC TGA TTC GAG CTC TAG TTT GCC ACT ATT CGA GCG AAT TCG A
              XhoI                              AflII HindIII

This DNA fragment can be used for replacing the multiple cloning site of plasmid
pEMBL9 (ranging from the EcoRI to the HindIII site). The 5'-part of the coding
strand of the synthetic DNA fragment codes for the KEX2 recognition site (ISKR),
a spacer (GGS) followed by the first 5 amino acids of the variable part of the
antibody heavy chain. The 3'-part of the coding sequence encodes the last 8 amino
acid residues of the variable part of the antibody light chain. Upon digesting the
obtained plasmid with PstI and XhoI a vector fragment of about 4 kb can be
isolated.

Upon ligating the about 0.7 kb PstI-XhoI fragment of pScFvLYSmyc with the about
4 kb vector fragment, pUR4158-A can be obtained containing the restored genes
encoding the VH and VL antibody fragments.
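The cloning sites carried by the synthetic fragment can be located with a simple motif search. The sketch below is illustrative only; it assumes that the two printed halves of the coding strand are contiguous, and it models the fragment as it would read after ligation into an EcoRI/HindIII-cut vector (a leading G and a trailing AGCTT contributed by the vector ends). The function and variable names are hypothetical.

```python
# Recognition sequences of the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "AflII": "CTTAAG",
    "HindIII": "AAGCTT",
}

# Coding strand of the synthetic fragment (SEQ. ID. NO: 27-30), assumed contiguous.
TOP = ("AATTCGATATCGAAGCGCGGCGGATCCCAGGTGCAGCTGCAGTAA"
       "GTGACTAAGCTCGAGATCAAACGGTGATAAGCTCGCTTA")

# Assumed context after ligation between the EcoRI and HindIII ends of a vector.
LIGATED = "G" + TOP + "AGCTT"


def find_sites(seq, sites):
    """Return {enzyme: [0-based start positions]} for every motif in `sites`."""
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits


hits = find_sites(LIGATED, SITES)
for name in sorted(hits, key=lambda n: hits[n][0]):
    print(f"{name:8s} {hits[name]}")
```

Each site occurs exactly once in this model, in the order EcoRI, EcoRV, BamHI, PstI, XhoI, AflII, HindIII, which is what makes the PstI-XhoI and EcoRV-AflII (or BamHI-AflII) exchanges described in the text possible.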



3.2 Construction of pXYL2.

Plasmid pAW14B was the starting vector for the construction of a series of
expression plasmids containing exlA expression signals and genes coding for ScFv
fragments. The plasmid comprises an Aspergillus niger var. awamori chromosomal
5.2 kb SalI fragment on which the 0.7 kb exlA gene is located, together with 2.5 kb
of 5'-flanking sequences and 2.0 kb of 3'-flanking sequences (see Figure 6 = Figure
3 of UNILEVER'S not prior-published WO 93/12237).
Upon digesting pAW14B with XbaI and BamHI, an about 3.2 kb XbaI-BamHI
fragment can be isolated comprising the exlA promoter, the exlA structural gene
and part of the exlA terminator area. This fragment can be cloned into plasmid
pBluescript (ex Stratagene) digested with the same enzymes, resulting in plasmid
pXYL1.

By applying PCR technology on the about 3.2 kb XbaI-BamHI fragment, it is
possible to change the 3'-end of the exlA structural gene by replacing the last
codon encoding serine and the stop codon TAA by the BamHI site GGA TCC
followed by 8 other codons comprising an EcoRV site and an EcoRI site, using a
first (anti-sense) primer (A) given below (SEQ. ID. NO: 31-34) and a second
(sense) primer (B), also given below, located upstream of the ScaI site (located in
the exlA gene). This sense primer corresponds with nucleotides 824-843 of Figure 1
of UNILEVER'S not prior-published WO 93/12237 forming part of the exlA gene.
After digesting the resulting PCR product with ScaI and EcoRI, an about 175 bp
ScaI-EcoRI fragment can be isolated. Upon digesting pXYL1 with ScaI (partially)
and with EcoRI (partially), an about 6 kb ScaI-EcoRI fragment, comprising the
intact pBluescript DNA and the exlA promoter region and most of the exlA
structural gene, can be isolated.

Ligation of the about 175 bp ScaI-EcoRI fragment with the about 6 kb ScaI-EcoRI
fragment ex pXYL1 will result in a plasmid, called pXYL2, which differs from
pXYL1 in that the 3'-part of the exlA gene and the terminator fragment are
replaced by the newly obtained ScaI-EcoRI PCR fragment.



Oligonucleotides used for changing the 3'-end of the exlA structural gene by means
of PCR technology.

A. anti-sense primer

      V   T   I   S   S   *
5'-T GTC ACG ATC TCC TCT TAA GGGATAAGTGCCTTGGTAGTC-3'
   | ||| ||| ||| |||
3'-A CAG TGC TAG AGG CCTAGG CGATTAC ACTATAGCTTAAG CTGA-5'
                     g s a n v i s n s t
                     BamHI          EcoRV  EcoRI

N.B. The PCR oligonucleotide is bold-printed; the corresponding amino acids
are given in small print.

B. sense primer (20-oligomer)

5'-GA ACT AAC GAA CCG TCC ATC-3' (SEQ. ID. NO: 35)



3.3 Construction of pUR4455 and pUR4456

Starting from pAW14B, pAW14B-10 was constructed by removing the EcoRI site
originating from the pUC19 polylinker and introducing a NotI site.
This was achieved by partially digesting plasmid pAW14B with EcoRI; after
dephosphorylation the linear 7.9 kb EcoRI plasmids were isolated and religated in
the presence of the "EcoRI"-NotI linker:

5'-AATTGCGGCCGC-3' (SEQ. ID. NO: 36).
       NotI

After selecting a plasmid still containing the EcoRI site in the upstream area of the
exlA structural gene, pAW14B-10 was obtained. Such selection method is known to
a skilled person.

Subsequently the AflII site, located downstream of the exlA terminator, was
removed by partially cleaving plasmid pAW14B-10 with AflII and religating the
isolated, linearized plasmid after filling in the sticky ends, resulting in plasmid
pAW14B-11 after selecting the plasmid still containing the AflII site near the stop
codon of the exlA gene. Such selection method is known to a skilled person.
This plasmid pAW14B-11 can be used for construction of a series of expression
plasmids comprising a DNA fragment coding for a fusion protein consisting of the
endoxylanase protein or part thereof and the ScFv fragment. Preferably the two
protein fragments are connected by a protease recognition site, e.g. the KEX2
cleavage site.

(i) Upon digesting plasmid pAW14B-11 with NotI and AflII, an about 4.7 kb
fragment can be isolated comprising the pUC19 vector and part of the exlA
terminator.

(ii) Upon digestion of pXYL2 with NotI and EcoRV, an about 3.2 kb
fragment can be isolated. Alternatively a NotI-BamHI fragment of about the same
length can be isolated.

(iii) Upon digesting pUR4158-A with EcoRV and AflII, an about 0.8 kb
fragment can be isolated encoding the ScFv-LYS preceded by a short (linker)
peptide comprising the KEX2 cleavage site and a spacer (GGS). Alternatively, a
BamHI-AflII fragment of about the same length can be isolated, which fragment
does not contain a DNA fragment encoding the KEX2 cleavage site.

A) For the construction of expression plasmids encoding the fusion protein
consisting of mature endoxylanase and ScFv-LYS, the about 4.7 kb NotI-AflII
fragment of pAW14B-11, the about 3.2 kb NotI-BamHI fragment of pXYL2 and the
about 0.75 kb BamHI-AflII fragment of pUR4158-A are ligated, resulting in pUR4455.
B) For the construction of expression plasmids encoding the fusion protein
consisting of mature endoxylanase and ScFv-LYS connected by the KEX2 cleavage
site, the about 4.7 kb NotI-AflII fragment of pAW14B-11, the about 3.2 kb
NotI-EcoRV fragment of pXYL2 and the about 0.75 kb EcoRV-AflII fragment of
pUR4158-A are ligated, resulting in pUR4456.

The constructed expression vectors can subsequently be transferred to moulds (for
example Aspergillus niger, Aspergillus niger var. awamori, Aspergillus nidulans etc.) by
means of conventional co-transformation techniques and the chimeric gene
comprising a DNA sequence encoding the desired ScFv fragment can then be
expressed via induction of the endoxylanase II promoter. The constructed vector
can also be provided with conventional selection markers (e.g. amdS or pyrG,
hygromycin etc.), e.g. by introducing the corresponding genes into the unique NotI
restriction site, and the mould can be transformed with the resulting vector to
produce the desired protein, essentially as described in Example 2 of
UNILEVER'S not prior-published WO 93/12237.

Example 4   Isolation of gene fragments of antibodies raised against (oral)
            microorganisms.

Monoclonal antibodies raised against oral microorganisms have been described in
the literature (De Soet et al., 1990), an example of which is OMVU10 raised
against streptococci. For the production of ScFv fragments derived from these
monoclonal antibodies the gene fragments encoding the variable regions of the
heavy and light chains had to be isolated. The isolation of RNA from the
hybridoma cell lines, the preparation of cDNA and amplification of gene fragments
encoding the variable regions of antibodies by PCR were performed according to
standard procedures known from the literature (see for example Orlandi et al.,
1989). For the PCR amplification different oligonucleotide primers have been
used,

for the heavy chain fragment:

A: 5'-AGG TSM ARC TGC AGS AGT CWG G-3' (SEQ. ID. NO: 37)
                PstI
in which S is C or G, M is A or C, R is A or G, and W is A or T

and

B: 5'-TGA GGA GAC GGT GAC CGT GGT CCC TTG GCC CC-3' (SEQ. ID. NO: 38),
                  BstEII

and for the light chain fragment (Kappa):

C: 5'-GAC ATT GAG CTC ACC CAG TCT CCA-3' (SEQ. ID. NO: 39)
              SacI

and

D: 5'-GTT TGA TCT CGA GCT TGG TCC C-3' (SEQ. ID. NO: 40).
               XhoI
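Primer A is a degenerate primer: each ambiguity code stands for two possible bases, so the oligonucleotide is in fact a mixture of sequences. The sketch below, which is illustrative and uses hypothetical names, expands the primer using only the ambiguity codes defined in the text (S, M, R, W) and confirms that the PstI site is invariant across the mixture.

```python
from itertools import product

# Ambiguity codes as defined in the text for primer A.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG",  # C or G
    "M": "AC",  # A or C
    "R": "AG",  # A or G
    "W": "AT",  # A or T
}


def expand(primer):
    """Return every concrete sequence represented by a degenerate primer."""
    primer = primer.replace(" ", "").upper()
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


PRIMER_A = "AGG TSM ARC TGC AGS AGT CWG G"   # SEQ. ID. NO: 37

variants = expand(PRIMER_A)
print(len(variants), "sequences in the primer mixture")
print("PstI site (CTGCAG) in every variant:",
      all("CTGCAG" in v for v in variants))
```

With five two-fold degenerate positions the mixture contains 2^5 = 32 distinct sequences.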

The heavy chain PCR fragment obtained in this way was digested with PstI and
BstEII and a PstI-BstEII fragment of about 0.33 kb was isolated. The thus obtained
fragment can be cloned into pUR4158-A. To this end pUR4158-A is digested with
PstI and BstEII, after which an about 4.4 kb vector fragment can be isolated.
Ligation of the above described heavy chain fragment of OMVU10 with the about
4.4 kb vector fragment will result in pUR4158-A10H. In this plasmid the heavy
chain fragment of the lysozyme antibody, which was originally present, is replaced
by that of the OMVU10 antibody.

The light chain PCR fragment obtained in a similar way was digested with SacI
and XhoI, and a SacI-XhoI fragment of about 0.3 kb was isolated. After digestion
of pUR4158-A10H with SacI and XhoI, a vector fragment of about 4.4 kb can be
isolated. Ligation of this vector fragment with the above described light chain
fragment of OMVU10 will result in pUR4457. In this plasmid both the heavy chain
fragment and the light chain fragment of the lysozyme antibody are replaced by the
appropriate heavy and light chain fragments of OMVU10. The nucleotide sequence
(SEQ. ID. NO: 41) and the deduced amino acid sequence (SEQ. ID. NO: 42) of
the PstI-XhoI fragment present in pUR4457 containing the thus obtained gene
encoding an ScFv fragment of OMVU10 are given below. The first 5 codons and the
last 5 codons are given in Example 3.1 above showing the overlap with the PstI and
XhoI sites.



Nucleotide sequence and deduced amino acid sequence of ScFv OMVU10

    PstI
  1 CTGCAGGAGTCAGGGGGAGGCTTAGTGCAGCCTGGAGGGTCCCGGAAACT  50
    LQESGGGLVQPGGSRKL

 51 CTCCTGTGCAGCCTCTGGATTCACTTTCAGTAACTTTGGAATGCACTGGG 100
    SCAASGFTFSNFGMHW
    > CDR I <

101 TTCGTCAGGCTCCAGAGAAGGGGCTGGAGTGGGTCGCATACATTAGTAGT 150
    VRQAPEKGLEWVAYISS
    >

151 GGCGGTACTACCATCTACTATTCAGACACAATGAAGGGCCGATTCACCAT 200
    GGTTIYYSDTMKGRFTI
    CDR II <

201 CTCCAGAGACAATCCCAAGAACACCCTGTTCCTGCAAATGACCAGTCTAA 250
    SRDNPKNTLFLQMTSL

251 GGTCTGAGGACACGGCCATGTATTTCTGTGCAAGATCCTGGGCCTATGCT 300
    RSEDTAMYFCARSWAYA
    > CDR III

    BstEII
301 ATGGACTACTGGGGCCAAGGGACCACGGTCACCGTCTCCTCAGGTGGAGG 350
    MDYWGQGTTVTVSSGGG
    < >

    SacI
351 CGGTTCAGGCGGAGGTGGCTCTGGCGGTGGCGGATCGGACATCGAGCTCA 400
    GSGGGGSGGGGSDIEL
    Linker < > VL

401 CCCAGTCTCCATCTTATCTTGCTGCATCTCCTGGAGAAATCATTACTATT 450
    TQSPSYLAASPGEIITI

451 AATTGCAGGGCAAGTAAGAGTATTAGCAAATATTTAGCCTGGTATCAAGA 500
    NCRASKSISKYLAWYQE
    > CDR I <

501 GAAACCTGGAAAAACAAATAAGCTTCTTATCTACTCTGGATCCATTTTGC 550
    KPGKTNKLLIYSGSIL
    > CDR II

551 AATCTGGAATTCCATCAAGGTTCAGTGGCAGTGGATCTGGTACAGATTTC 600
    QSGIPSRFSGSGSGTDF
    <

601 ACTCTCACCATCAGTAGCCTGGAGCCTGAAGATTTTGCAATGTATTACTG 650
    TLTISSLEPEDFAMYYC

    XhoI
651 TCAACAGCATAATGAATACCCGTGGACGTTCGGTGGAGGGACCAAGCTCGAG 702
    QQHNEYPWTFGGGTKLE
    > CDR III <



Example 5   Construction of an expression cassette for the production of an
            OMVU10 ScFv fragment.

After digesting pUR4457 (see Example 4) with EcoRV and AflII, an about 0.8 kb
fragment can be isolated encoding the ScFv-OMVU10 preceded by a short (linker)
peptide comprising the KEX2 cleavage site and the GGS spacer. Alternatively, a
BamHI-AflII fragment of about 0.75 kb can be isolated for the construction of
expression plasmids coding for fusion proteins not containing a KEX2 cleavage
site.

Upon ligating the thus obtained fragments with the fragments obtained in 3.3 (i)
and (ii) in the same way as described in 3.3 B) and A), an expression plasmid can
be obtained containing a DNA sequence coding for a fusion protein comprising the
endoxylanase protein and the ScFv OMVU10 fragment, either with (pUR4460) or
without (pUR4459) the KEX2 cleavage site, respectively.

Analogous to the method described in Example 3, the resulting plasmids (either
with or without an added selection marker) can be introduced into Aspergillus.
Example 6   Isolation of gene fragments of an antibody raised against human
            pregnancy hormone (HCG).

In much the same way as described in Example 4, gene fragments coding for the
variable regions of the heavy and the light chains of anti-HCG antibodies were
isolated and can be cloned into plasmid pUR4158-A, which results in plasmid
pUR4458. The nucleotide sequence (SEQ. ID. NO: 7) and the deduced amino acid
sequence (SEQ. ID. NO: 8) of the PstI-XhoI fragment encoding the ScFv-HCG
fragment were given above in Example 1.2.

Example 7   Construction of expression cassettes for the production of ScFv
            fragments, using the endoxylanase promoter and terminator and a
            DNA sequence encoding the prepro-"glaA2" protein.

7.1 Construction of pAW14B-12.

Plasmid pAW14B-12 was constructed using pAW14B-11 (see Example 3.3) as
starting material. After digestion of pAW14B-11 with AflII (located at the exlA stop
codon) and BglII (located in the exlA promoter) the 2.4 kb AflII-BglII fragment,
containing part of the exlA promoter and the exlA gene, was isolated.
After partial digestion of this fragment with BspHI (located in the exlA promoter
and the exlA start codon) the isolated 1.8 kb BglII-BspHI exlA promoter fragment
(up to the ATG) was ligated with the isolated 5.5 kb AflII-BglII fragment of
pAW14B-11, containing the exlA terminator, in the presence of the synthetic DNA
oligonucleotides:

   (BspHI)               AflII
5'- CAT GCA GTC TTC GGG C      -3' (SEQ. ID. NO: 43)
3'-      GT CAG AAG CCC GAA TT -5' (SEQ. ID. NO: 44)
              BbsI

resulting in pAW14B-12.

7.2 Assembly of expression cassettes

(i) Upon digesting pAW14B-12 with BbsI (partially) and AflII, an about 7.3
kb BspHI-AflII vector fragment was isolated.
(ii) From plasmid pAN56-4 (described in the above mentioned reference of
M.P. Broekhuijsen et al.) an about 1.9 kb NcoI-EcoRV fragment was isolated,
comprising part of the glaA gene, starting from the ATG initiation codon (which
coincides with the NcoI site), and coding for the glucoamylase prepro part and the
first 514 amino acids of the mature glucoamylase ("glaA2").

(iii) From the plasmids pUR4158-A (encoding the ScFv-LYS fragment
preceded by the KEX2 recognition site and the GGS spacer: see Example 3.1),
pUR4457 (encoding the ScFv-OMVU10 fragment preceded by the KEX2
recognition site and the GGS spacer: see Example 4), and pUR4458 (encoding
the ScFv-HCG fragment preceded by the KEX2 recognition site and the GGS
spacer: see Example 6) EcoRV-AflII fragments of about 0.8 kb were isolated.

Upon ligating (i) the BspHI-AflII vector fragment, (ii) the NcoI-EcoRV glaA
fragment (NcoI sticky ends are compatible with BspHI sticky ends), and either of
the EcoRV-AflII ScFv encoding fragments, a set of expression plasmids can be
obtained.
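The statement that NcoI and BspHI ends are compatible follows from their cut positions: both enzymes leave the same 4-nucleotide 5' overhang. The sketch below illustrates this for palindromic sites cut one base from the 5' end of the recognition sequence (NcoI C^CATGG, BspHI T^CATGA, and, for comparison, BamHI, BglII, EcoRI and HindIII). The function names are hypothetical.

```python
# (recognition site, number of bases 5' of the cut on the top strand)
ENZYMES = {
    "NcoI": ("CCATGG", 1),
    "BspHI": ("TCATGA", 1),
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
}


def five_prime_overhang(site, cut):
    """Overhang left by a symmetric cut in a palindromic site (5'->3')."""
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """True when the two enzymes leave identical 5' overhangs."""
    return (five_prime_overhang(*ENZYMES[enzyme_a])
            == five_prime_overhang(*ENZYMES[enzyme_b]))


def hybrid_junction(left_enzyme, right_enzyme):
    """Sequence formed when a left-hand end is ligated to a right-hand end."""
    left_site, left_cut = ENZYMES[left_enzyme]
    right_site, right_cut = ENZYMES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


print("NcoI overhang: ", five_prime_overhang(*ENZYMES["NcoI"]))
print("BspHI overhang:", five_prime_overhang(*ENZYMES["BspHI"]))
print("compatible:    ", compatible("NcoI", "BspHI"))
print("BspHI/NcoI junction:", hybrid_junction("BspHI", "NcoI"))
```

The BspHI/NcoI junction (TCATGG) matches neither recognition sequence, so the hybrid site is not recut by either enzyme, while the ATG within the shared CATG overhang is preserved.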

pUR4462   PexlA - prepro-"glaA2"-KEX2-ScFv-LYS
pUR4463   PexlA - prepro-"glaA2"-KEX2-ScFv-HCG
pUR4464   PexlA - prepro-"glaA2"-KEX2-ScFv-OMVU10

After insertion of the amdS selection marker into the NotI site, the resulting
plasmids were introduced into Aspergillus, as described in Example 3.

7.3 Production of ScFv-LYS

Upon growth of the resulting Aspergillus niger var. awamori transformed with
pUR4462 in a 10 litre fermenter, the culture medium was analyzed by
polyacrylamide gel electrophoresis. Figure 7 shows the gel after it was stained with
Coomassie Brilliant Blue; arrows indicate the released ScFv-LYS
fragment and the fusion protein and/or the truncated glaA protein.
The amount of "active" ScFv-LYS was determined to be about 250 mg/l.
It is obvious that further optimization of the fermentation conditions or
mutagenesis of the production strain will result in even higher production levels.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: UNILEVER N.V.
(B) STREET: Weena 455
(C) CITY: Rotterdam
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3013 AL

(A) NAME: UNILEVER PLC
(B) STREET: Unilever House Blackfriars
(C) CITY: London
(E) COUNTRY: United Kingdom
(F) POSTAL CODE (ZIP): EC4P 4BQ

(A) NAME: NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-
NATUURWETENSCHAPPELIJK ONDERZOEK TNO
(B) STREET: Schoemakersstraat 97
(C) CITY: Delft
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-2628 VK

(A) NAME: Leon Gerardus Joseph FRENKEN
(B) STREET: Geldersestraat 90
(C) CITY: Rotterdam
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3011 MP

(A) NAME: Robert F.M. van GORCOM
(B) STREET: Liberiastraat 7
(C) CITY: Delft
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-2622 DE

(A) NAME: Johanna G.M. HESSING
(B) STREET: Adema van Scheltemaplein 38
(C) CITY: Delft
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-2624 PG

(A) NAME: Cornelis Antonius M.J.J. van den HONDEL
(B) STREET: Waterlelie 124
(C) CITY: Gouda
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-2804 PZ

(A) NAME: Wouter MUSTERS
(B) STREET: Wipperspark 138
(C) CITY: Maassluis
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3141 RD

(A) NAME: Johannes Maria A. VERBAKEL
(B) STREET: Ingeland 9
(C) CITY: Maasland
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3155 GC

(A) NAME: Cornelis Theodorus VERRIPS
(B) STREET: Hagedoorn 18
(C) CITY: Maassluis
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3142 KB



(ii) TITLE OF INVENTION: 

Process for producing fusion proteins comprising 
ScFv fragments by a transformed mould 

(iii) NUMBER OF SEQUENCES: 45 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 
1               5                   10                  15
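
SEQ ID NO: 1 can be read as three repeats of a Gly-Gly-Gly-Gly-Ser unit. The following minimal Python sketch converts the three-letter listing to one-letter code and checks that repeat structure; the dictionary and function names are hypothetical and cover only the two residues needed here.

```python
# Only the residues that occur in SEQ ID NO: 1 (illustrative subset).
THREE_TO_ONE = {"Gly": "G", "Ser": "S"}


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


linker = to_one_letter(
    "Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser"
)
# True when the linker is exactly three copies of the GGGGS unit.
is_triple_repeat = linker == "GGGGS" * 3
```

The same 15-residue stretch recurs between the heavy and light chain variable fragments in the ScFv sequences further down in this listing.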

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..855

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAG CTT GCA TGC AAA TTC TAT TTC AAG GAG ACA GTC ATA ATG AAA TAC  48
Lys Leu Ala Cys Lys Phe Tyr Phe Lys Glu Thr Val Ile Met Lys Tyr
1               5                   10                  15

CTA TTG CCT ACG GCA GCC GCT GGA TTG TTA TTA CTC GCT GCC CAA CCA  96
Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro
            20                  25                  30

GCG ATG GCC CAG GTG CAG CTG CAG GAG TCA GGA CCT GGC CTG GTG GCG  144
Ala Met Ala Gln Val Gln Leu Gln Glu Ser Gly Pro Gly Leu Val Ala
        35                  40                  45

CCC TCA CAG AGC CTG TCC ATC ACA TGC ACC GTC TCA GGG TTC TCA TTA  192
Pro Ser Gln Ser Leu Ser Ile Thr Cys Thr Val Ser Gly Phe Ser Leu
    50                  55                  60

ACC GGC TAT GGT GTA AAC TGG GTT CGC CAG CCT CCA GGA AAG GGT CTG  240
Thr Gly Tyr Gly Val Asn Trp Val Arg Gln Pro Pro Gly Lys Gly Leu
65                  70                  75                  80

GAG TGG CTG GGA ATG ATT TGG GGT GAT GGA AAC ACA GAC TAT AAT TCA  288
Glu Trp Leu Gly Met Ile Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser
                85                  90                  95

GCT CTC AAA TCC AGA CTG AGC ATC AGC AAG GAC AAC TCC AAG AGC CAA  336
Ala Leu Lys Ser Arg Leu Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln
            100                 105                 110

GTT TTC TTA AAA ATG AAC AGT CTG CAC ACT GAT GAC ACA GCC AGG TAC  384
Val Phe Leu Lys Met Asn Ser Leu His Thr Asp Asp Thr Ala Arg Tyr
        115                 120                 125

TAC TGT GCC AGA GAG AGA GAT TAT AGG CTT GAC TAC TGG GGC CAA GGC  432
Tyr Cys Ala Arg Glu Arg Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly
    130                 135                 140

ACC ACG GTC ACC GTC TCC TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC  480
Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
145                 150                 155                 160

TCT GGC GGT GGC GGA TCG GAC ATC GAG CTC ACT CAG TCT CCA GCC TCC  528
Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser
                165                 170                 175

CTT TCT GCG TCT GTG GGA GAA ACT GTC ACC ATC ACA TGT CGA GCA AGT  576
Leu Ser Ala Ser Val Gly Glu Thr Val Thr Ile Thr Cys Arg Ala Ser
            180                 185                 190

GGG AAT ATT CAC AAT TAT TTA GCA TGG TAT CAG CAG AAA CAG GGA AAA  624
Gly Asn Ile His Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys
        195                 200                 205

TCT CCT CAG CTC CTG GTC TAT TAT ACA ACA ACC TTA GCA GAT GGT GTG  672
Ser Pro Gln Leu Leu Val Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val
    210                 215                 220

CCA TCA AGG TTC AGT GGC AGT GGA TCA GGA ACA CAA TAT TCT CTC AAG  720
Pro Ser Arg Phe Ser Gly Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys
225                 230                 235                 240

ATC AAC AGC CTG CAA CCT GAA GAT TTT GGG AGT TAT TAC TGT CAA CAT  768
Ile Asn Ser Leu Gln Pro Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His
                245                 250                 255

TTT TGG AGT ACT CCT CGG ACG TTC GGT GGA GGC ACC AAG CTC GAG ATC  816
Phe Trp Ser Thr Pro Arg Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile
            260                 265                 270

AAA CGG GAA CAA AAA CTC ATC TCA GAA GAG GAT CTG AAT TAATAATGAT  865
Lys Arg Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn
        275                 280                 285

CAAACGGTAA TAAGGATCCA GCTCGAATTC  895
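
The pairing of nucleotide and amino acid lines in SEQ ID NO: 2 can be spot-checked by translating the coding sequence with the standard genetic code. The sketch below is a minimal illustration in Python, assuming the standard codon table; the function and variable names are hypothetical, and only the first 48 nucleotides of the listing are used as input.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) codon by codon, frame 1."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First row of SEQ ID NO: 2 (nucleotides 1..48).
first_row = "AAG CTT GCA TGC AAA TTC TAT TTC AAG GAG ACA GTC ATA ATG AAA TAC"
first_residues = translate(first_row)

# A CDS spanning positions 1..855 corresponds to 855 / 3 codons.
cds_codons = 855 // 3
```

The 285 codons implied by the CDS location 1..855 agree with the length of 285 amino acids given for SEQ ID NO: 3.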

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 285 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Lys Leu Ala Cys Lys Phe Tyr Phe Lys Glu Thr Val Ile Met Lys Tyr
1               5                   10                  15

Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro
            20                  25                  30

Ala Met Ala Gln Val Gln Leu Gln Glu Ser Gly Pro Gly Leu Val Ala
        35                  40                  45

Pro Ser Gln Ser Leu Ser Ile Thr Cys Thr Val Ser Gly Phe Ser Leu
    50                  55                  60

Thr Gly Tyr Gly Val Asn Trp Val Arg Gln Pro Pro Gly Lys Gly Leu
65                  70                  75                  80

Glu Trp Leu Gly Met Ile Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser
                85                  90                  95

Ala Leu Lys Ser Arg Leu Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln
            100                 105                 110

Val Phe Leu Lys Met Asn Ser Leu His Thr Asp Asp Thr Ala Arg Tyr
        115                 120                 125

Tyr Cys Ala Arg Glu Arg Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly
    130                 135                 140

Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
145                 150                 155                 160

Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser
                165                 170                 175

Leu Ser Ala Ser Val Gly Glu Thr Val Thr Ile Thr Cys Arg Ala Ser
            180                 185                 190

Gly Asn Ile His Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys
        195                 200                 205

Ser Pro Gln Leu Leu Val Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val
    210                 215                 220

Pro Ser Arg Phe Ser Gly Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys
225                 230                 235                 240

Ile Asn Ser Leu Gln Pro Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His
                245                 250                 255

Phe Trp Ser Thr Pro Arg Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile
            260                 265                 270

Lys Arg Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn
        275                 280                 285



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TCGAGATCAA ACGGTAATGA G 21

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AATTCTCATT ACCGTTTGAT C 21
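
SEQ ID NO: 4 and SEQ ID NO: 5 are largely complementary, so they can be annealed into a short duplex with single-stranded ends. The following Python sketch, given as an illustration only, computes the paired core and the two overhangs; the function names are hypothetical, and the reading of the overhangs as XhoI-type (TCGA) and EcoRI-type (AATT) ends is an inference from general knowledge of those enzymes, not a statement from this section.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top, bottom):
    """Return (top 5' overhang, duplex core, bottom 5' overhang).

    Both oligonucleotides are written 5'->3' and are assumed to pair
    without gaps or mismatches.
    """
    bottom_rc = reverse_complement(bottom)
    # Look for the longest suffix of `top` that is a prefix of `bottom_rc`.
    for start in range(len(top)):
        core = top[start:]
        if bottom_rc.startswith(core):
            tail = bottom_rc[len(core):]
            return top[:start], core, reverse_complement(tail)
    return top, "", bottom


seq4 = "TCGAGATCAAACGGTAATGAG"  # SEQ ID NO: 4
seq5 = "AATTCTCATTACCGTTTGATC"  # SEQ ID NO: 5
top_overhang, core, bottom_overhang = anneal(seq4, seq5)
```

With these inputs the duplex core is 17 base pairs long and each strand carries a 4-nucleotide 5' overhang.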



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Glu Ile Lys Arg
1 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..717 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CTG CAG GAG TCT GGG GGA CAC TTA GTG AAG CCT GGA GGG TCC CTG AAA  48
Leu Gln Glu Ser Gly Gly His Leu Val Lys Pro Gly Gly Ser Leu Lys
1               5                   10                  15

CTC TCC TGT GCA GCC TCT GGA TTC GCT TTC AGT AGC TTT GAC ATG TCT  96
Leu Ser Cys Ala Ala Ser Gly Phe Ala Phe Ser Ser Phe Asp Met Ser
            20                  25                  30

TGG ATT CGC CAG ACT CCG GAG AAG AGG CTG GAG TGG GTC GCA AGC ATT  144
Trp Ile Arg Gln Thr Pro Glu Lys Arg Leu Glu Trp Val Ala Ser Ile
        35                  40                  45

ACT AAT GTT GGT ACT TAC ACC TAC TAT CCA GGC AGT GTG AAG GGC CGA  192
Thr Asn Val Gly Thr Tyr Thr Tyr Tyr Pro Gly Ser Val Lys Gly Arg
    50                  55                  60

TTC TCC ATC TCC AGA GAC AAT GCC AGG AAC ACC CTA AAC CTG CAA ATG  240
Phe Ser Ile Ser Arg Asp Asn Ala Arg Asn Thr Leu Asn Leu Gln Met
65                  70                  75                  80

AGC AGT CTG AGG TCT GAG GAC ACG GCC TTG TAT TTC TGT GCA AGA CAG  288
Ser Ser Leu Arg Ser Glu Asp Thr Ala Leu Tyr Phe Cys Ala Arg Gln
                85                  90                  95

GGG ACT GCG GCA CAA CCT TAC TGG TAC TTC GAT GTC TGG GGC CAA GGG  336
Gly Thr Ala Ala Gln Pro Tyr Trp Tyr Phe Asp Val Trp Gly Gln Gly
            100                 105                 110

ACC ACG GTC ACC GTC TCC TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC  384
Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
        115                 120                 125

TCT GGC GGT GGC GGA TCG GAC ATC GAG CTC ACC CAG TCT CCA AAA TCC  432
Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Lys Ser
    130                 135                 140

ATG TCC ATG TCC GTA GGA GAG AGG GTC ACC TTG AGC TGC AAG GCC AGT  480
Met Ser Met Ser Val Gly Glu Arg Val Thr Leu Ser Cys Lys Ala Ser
145                 150                 155                 160

GAG ACT GTG GAT TCT TTT GTG TCC TGG TAT CAA CAG AAA CCA GAA CAG  528
Glu Thr Val Asp Ser Phe Val Ser Trp Tyr Gln Gln Lys Pro Glu Gln
                165                 170                 175

TCT CCT AAA TTG TTG ATA TTC GGG GCA TCC AAC CGG TTC AGT GGG GTC  576
Ser Pro Lys Leu Leu Ile Phe Gly Ala Ser Asn Arg Phe Ser Gly Val
            180                 185                 190

CCC GAT CGC TTC ACT GGC AGT GGA TCT GCA ACA GAC TTC ACT CTG ACC  624
Pro Asp Arg Phe Thr Gly Ser Gly Ser Ala Thr Asp Phe Thr Leu Thr
        195                 200                 205

ATC AGC AGT GTG CAG GCT GAG GAC TTT GCG GAT TAC CAC TGT GGA CAG  672
Ile Ser Ser Val Gln Ala Glu Asp Phe Ala Asp Tyr His Cys Gly Gln
    210                 215                 220

ACT TAC AAT CAT CCG TAT ACG TTC GGA GGG GGG ACC AAG CTC GAG  717
Thr Tyr Asn His Pro Tyr Thr Phe Gly Gly Gly Thr Lys Leu Glu
225                 230                 235



SUBSTITUTE SHEET (RULE 26) 



WO 94/29457 PCT/EP94/01906 


(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Leu Gln Glu Ser Gly Gly His Leu Val Lys Pro Gly Gly Ser Leu Lys
1               5                   10                  15

Leu Ser Cys Ala Ala Ser Gly Phe Ala Phe Ser Ser Phe Asp Met Ser
            20                  25                  30

Trp Ile Arg Gln Thr Pro Glu Lys Arg Leu Glu Trp Val Ala Ser Ile
        35                  40                  45

Thr Asn Val Gly Thr Tyr Thr Tyr Tyr Pro Gly Ser Val Lys Gly Arg
    50                  55                  60

Phe Ser Ile Ser Arg Asp Asn Ala Arg Asn Thr Leu Asn Leu Gln Met
65                  70                  75                  80

Ser Ser Leu Arg Ser Glu Asp Thr Ala Leu Tyr Phe Cys Ala Arg Gln
                85                  90                  95

Gly Thr Ala Ala Gln Pro Tyr Trp Tyr Phe Asp Val Trp Gly Gln Gly
            100                 105                 110

Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
        115                 120                 125

Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Thr Gln Ser Pro Lys Ser
    130                 135                 140

Met Ser Met Ser Val Gly Glu Arg Val Thr Leu Ser Cys Lys Ala Ser
145                 150                 155                 160

Glu Thr Val Asp Ser Phe Val Ser Trp Tyr Gln Gln Lys Pro Glu Gln
                165                 170                 175

Ser Pro Lys Leu Leu Ile Phe Gly Ala Ser Asn Arg Phe Ser Gly Val
            180                 185                 190

Pro Asp Arg Phe Thr Gly Ser Gly Ser Ala Thr Asp Phe Thr Leu Thr
        195                 200                 205

Ile Ser Ser Val Gln Ala Glu Asp Phe Ala Asp Tyr His Cys Gly Gln
    210                 215                 220

Thr Tyr Asn His Pro Tyr Thr Phe Gly Gly Gly Thr Lys Leu Glu
225                 230                 235

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

AATTCCATGG GCTTCCGATC TCTACTCGCC CTGAGCGGCC TCGTCTGCAC 50 
AGGGTTGGCA CAGGTGCAGC TGCAGTAAGT GACTAAGCTC GAGATCAAAC 100 
GGTGATA 107 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGCTTATCAC CGTTTGATCT CGAGCTTAGT CACTTACTGC AGCTGCACCT 50 
GTGCCAACCC TGTGCAGACG AGGCCGCTCA GGGCGAGTAG AGATCGGAAG 100 
CCCATGG 107 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Gly Phe Arg Ser Leu Leu Ala Leu Ser Gly Leu Val Cys Thr
1               5                   10                  15

Gly Leu Ala Gln Val Gln Leu Gln
                20



(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Val Thr Lys Leu Glu Ile Lys Arg
1               5



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GCTTCCTCCC TTTTAGACGC AACTGAGAGC CTGAGGTTCA TCCCCAGCAT 
CATTACACCT GAGC 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CATGGCTGAG GTGTAATGAT GGTGGGGATG AAGCTCAGGC TCTCAGTTGC 
GTCTAAAAGG GAGGAAGC 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CATGGCCGAT ATCGCAAGCT TCCG 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GATCCGGAAG CTTGCGATAT CGGC 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
AATTCGATAT CGAAGCGCGG CGGATCCCAG GTGCAGCTGC A 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GCTGCACCTG GGATCCGCCG CGCTTCGATA TCG 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 



(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Ile Ser Lys Arg Gly Gly Ser Gln Val Gln Leu Gln
1               5                   10
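
SEQ ID NO: 19 shows the junction in which a Lys-Arg pair (the KEX2 recognition site) is followed by the Gly-Gly-Ser spacer and the start of the ScFv fragment. The sketch below illustrates, in simplified form, where a KEX2-like protease would be expected to cut. It assumes cleavage directly after every Lys-Arg pair and ignores any further sequence context; the function name is hypothetical.

```python
def kex2_cleave(protein, site="KR"):
    """Split a one-letter protein sequence after each occurrence of `site`.

    Simplifying assumption: cleavage occurs at the C-terminal side of
    every Lys-Arg pair, regardless of the neighbouring residues.
    """
    fragments = []
    start = 0
    pos = protein.find(site)
    while pos != -1:
        end = pos + len(site)
        fragments.append(protein[start:end])
        start = end
        pos = protein.find(site, start)
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


junction = "ISKRGGSQVQLQ"  # SEQ ID NO: 19 in one-letter code
fragments = kex2_cleave(junction)
```

Under this assumption the released C-terminal part begins with the GGS spacer followed by the first residues of the ScFv fragment.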



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
AATTCGATAT CGATCGAAGG TCGAGGCGGA TCCCAGGTGC AGCTGCAG 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GCTGCACCTG GGATCCGCCT CGACCTTCGA TCGATATCG 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Ile Ser Ile Glu Gly Arg Gly Gly Ser Gln Val Gln Leu Gln
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GGCCGCTGTG CAG 13 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AATTCTGCAC AGC 13 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 699 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..699 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

CTG CAG GAG TCA GGA CCT GGC CTG GTG GCG CCC TCA CAG AGC CTG TCC  48
Leu Gln Glu Ser Gly Pro Gly Leu Val Ala Pro Ser Gln Ser Leu Ser
1               5                   10                  15

ATC ACA TGC ACC GTC TCA GGG TTC TCA TTA ACC GGC TAT GGT GTA AAC  96
Ile Thr Cys Thr Val Ser Gly Phe Ser Leu Thr Gly Tyr Gly Val Asn
            20                  25                  30

TGG GTT CGC CAG CCT CCA GGA AAG GGT CTG GAG TGG CTG GGA ATG ATT  144
Trp Val Arg Gln Pro Pro Gly Lys Gly Leu Glu Trp Leu Gly Met Ile
        35                  40                  45

TGG GGT GAT GGA AAC ACA GAC TAT AAT TCA GCT CTC AAA TCC AGA CTG  192
Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser Ala Leu Lys Ser Arg Leu
    50                  55                  60

AGC ATC AGC AAG GAC AAC TCC AAG AGC CAA GTT TTC TTA AAA ATG AAC  240
Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln Val Phe Leu Lys Met Asn
65                  70                  75                  80

AGT CTG CAC ACT GAT GAC ACA GCC AGG TAC TAC TGT GCC AGA GAG AGA  288
Ser Leu His Thr Asp Asp Thr Ala Arg Tyr Tyr Cys Ala Arg Glu Arg
                85                  90                  95

GAT TAT AGG CTT GAC TAC TGG GGC CAA GGC ACC ACG GTC ACC GTC TCC  336
Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val Ser
            100                 105                 110

TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC TCT GGC GGT GGC GGA TCG  384
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
        115                 120                 125

GAC ATC GAG CTC ACT CAG TCT CCA GCC TCC CTT TCT GCG TCT GTG GGA  432
Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser Leu Ser Ala Ser Val Gly
    130                 135                 140

GAA ACT GTC ACC ATC ACA TGT CGA GCA AGT GGG AAT ATT CAC AAT TAT  480
Glu Thr Val Thr Ile Thr Cys Arg Ala Ser Gly Asn Ile His Asn Tyr
145                 150                 155                 160

TTA GCA TGG TAT CAG CAG AAA CAG GGA AAA TCT CCT CAG CTC CTG GTC  528
Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys Ser Pro Gln Leu Leu Val
                165                 170                 175

TAT TAT ACA ACA ACC TTA GCA GAT GGT GTG CCA TCA AGG TTC AGT GGC  576
Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val Pro Ser Arg Phe Ser Gly
            180                 185                 190

AGT GGA TCA GGA ACA CAA TAT TCT CTC AAG ATC AAC AGC CTG CAA CCT  624
Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys Ile Asn Ser Leu Gln Pro
        195                 200                 205

GAA GAT TTT GGG AGT TAT TAC TGT CAA CAT TTT TGG AGT ACT CCT CGG  672
Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His Phe Trp Ser Thr Pro Arg
    210                 215                 220

ACG TTC GGT GGA GGC ACC AAG CTC GAG  699
Thr Phe Gly Gly Gly Thr Lys Leu Glu
225                 230

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Leu Gln Glu Ser Gly Pro Gly Leu Val Ala Pro Ser Gln Ser Leu Ser
1               5                   10                  15

Ile Thr Cys Thr Val Ser Gly Phe Ser Leu Thr Gly Tyr Gly Val Asn
            20                  25                  30

Trp Val Arg Gln Pro Pro Gly Lys Gly Leu Glu Trp Leu Gly Met Ile
        35                  40                  45

Trp Gly Asp Gly Asn Thr Asp Tyr Asn Ser Ala Leu Lys Ser Arg Leu
    50                  55                  60

Ser Ile Ser Lys Asp Asn Ser Lys Ser Gln Val Phe Leu Lys Met Asn
65                  70                  75                  80

Ser Leu His Thr Asp Asp Thr Ala Arg Tyr Tyr Cys Ala Arg Glu Arg
                85                  90                  95

Asp Tyr Arg Leu Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val Ser
            100                 105                 110

Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
        115                 120                 125

Asp Ile Glu Leu Thr Gln Ser Pro Ala Ser Leu Ser Ala Ser Val Gly
    130                 135                 140

Glu Thr Val Thr Ile Thr Cys Arg Ala Ser Gly Asn Ile His Asn Tyr
145                 150                 155                 160

Leu Ala Trp Tyr Gln Gln Lys Gln Gly Lys Ser Pro Gln Leu Leu Val
                165                 170                 175

Tyr Tyr Thr Thr Thr Leu Ala Asp Gly Val Pro Ser Arg Phe Ser Gly
            180                 185                 190

Ser Gly Ser Gly Thr Gln Tyr Ser Leu Lys Ile Asn Ser Leu Gln Pro
        195                 200                 205

Glu Asp Phe Gly Ser Tyr Tyr Cys Gln His Phe Trp Ser Thr Pro Arg
    210                 215                 220

Thr Phe Gly Gly Gly Thr Lys Leu Glu
225                 230

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AATTCGATAT CGAAGCGCGG CGGATCCCAG GTGCAGCTGC AGTAAGTGAC 50 
TAAGCTCGAG ATCAAACGGT GATAAGCTCG CTTA 84



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

AGCTTAAGCG AGCTTATCAC CGTTTGATCT CGAGCTTAGT CACTTACTGC 50
AGCTGCACCT GGGATCCGCC GCGCTTCGAT ATCG 84



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Ile Ser Lys Arg Gly Gly Ser Gln Val Gln Leu Gln
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Val Thr Lys Leu Glu Ile Lys Arg
1               5



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
TGTCACGATC TCCTCTTAAG GGATAAGTGC CTTGGTAGTC 40 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
AGTCGAATTC GATATCACAT TAGCGGATCC GGAGATCGTG ACA 43 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Val Thr Ile Ser Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Gly Ser Ala Asn Val Ile Ser Asn Ser Thr
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GAACTAACGA ACCGTCCATC 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
AATTGCGGCC GC 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
AGGTSMARCT GCAGSAGTCW GG 22 
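
SEQ ID NO: 37 contains the degenerate symbols S, M, R and W. The following Python sketch, offered as an illustration, expands the primer into its individual variants and checks that it can pair with nucleotides 103..132 of SEQ ID NO: 2. It assumes the usual IUPAC meanings (S = C or G, M = A or C, R = A or G, W = A or T); the function names are hypothetical.

```python
import re
from itertools import product

# IUPAC nucleotide codes used in SEQ ID NO: 37 (assumed standard meanings).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG", "M": "AC", "R": "AG", "W": "AT",
}


def expand(primer):
    """List every unambiguous sequence covered by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def to_regex(primer):
    """Build a regular expression matching any variant of the primer."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


primer = "AGGTSMARCTGCAGSAGTCWGG"  # SEQ ID NO: 37
variants = expand(primer)

# Nucleotides 103..132 of SEQ ID NO: 2 (start of the heavy chain fragment).
target = "GCCCAGGTGCAGCTGCAGGAGTCAGGACCT"
match = re.search(to_regex(primer), target)
```

With five two-fold degenerate positions the primer represents 32 distinct sequences, one of which is found in the target stretch.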

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
TGAGGAGACG GTGACCGTGG TCCCTTGGCC CC 32 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GACATTGAGC TCACCCAGTC TCCA 24 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GTTTGATCTC GAGCTTGGTC CC 22 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..702 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

CTG CAG GAG TCA GGG GGA GGC TTA GTG CAG CCT GGA GGG TCC CGG AAA  48
Leu Gln Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Arg Lys
1               5                   10                  15

CTC TCC TGT GCA GCC TCT GGA TTC ACT TTC AGT AAC TTT GGA ATG CAC  96
Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Asn Phe Gly Met His
            20                  25                  30

TGG GTT CGT CAG GCT CCA GAG AAG GGG CTG GAG TGG GTC GCA TAC ATT  144
Trp Val Arg Gln Ala Pro Glu Lys Gly Leu Glu Trp Val Ala Tyr Ile
        35                  40                  45

AGT AGT GGC GGT ACT ACC ATC TAC TAT TCA GAC ACA ATG AAG GGC CGA  192
Ser Ser Gly Gly Thr Thr Ile Tyr Tyr Ser Asp Thr Met Lys Gly Arg
    50                  55                  60

TTC ACC ATC TCC AGA GAC AAT CCC AAG AAC ACC CTG TTC CTG CAA ATG  240
Phe Thr Ile Ser Arg Asp Asn Pro Lys Asn Thr Leu Phe Leu Gln Met
65                  70                  75                  80

ACC AGT CTA AGG TCT GAG GAC ACG GCC ATG TAT TTC TGT GCA AGA TCC  288
Thr Ser Leu Arg Ser Glu Asp Thr Ala Met Tyr Phe Cys Ala Arg Ser
                85                  90                  95

TGG GCC TAT GCT ATG GAC TAC TGG GGC CAA GGG ACC ACG GTC ACC GTC  336
Trp Ala Tyr Ala Met Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val
            100                 105                 110

TCC TCA GGT GGA GGC GGT TCA GGC GGA GGT GGC TCT GGC GGT GGC GGA  384
Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
        115                 120                 125

TCG GAC ATC GAG CTC ACC CAG TCT CCA TCT TAT CTT GCT GCA TCT CCT  432
Ser Asp Ile Glu Leu Thr Gln Ser Pro Ser Tyr Leu Ala Ala Ser Pro
    130                 135                 140

GGA GAA ATC ATT ACT ATT AAT TGC AGG GCA AGT AAG AGT ATT AGC AAA  480
Gly Glu Ile Ile Thr Ile Asn Cys Arg Ala Ser Lys Ser Ile Ser Lys
145                 150                 155                 160

TAT TTA GCC TGG TAT CAA GAG AAA CCT GGA AAA ACA AAT AAG CTT CTT  528
Tyr Leu Ala Trp Tyr Gln Glu Lys Pro Gly Lys Thr Asn Lys Leu Leu
                165                 170                 175

ATC TAC TCT GGA TCC ATT TTG CAA TCT GGA ATT CCA TCA AGG TTC AGT  576
Ile Tyr Ser Gly Ser Ile Leu Gln Ser Gly Ile Pro Ser Arg Phe Ser
            180                 185                 190

GGC AGT GGA TCT GGT ACA GAT TTC ACT CTC ACC ATC AGT AGC CTG GAG  624
Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Glu
        195                 200                 205

CCT GAA GAT TTT GCA ATG TAT TAC TGT CAA CAG CAT AAT GAA TAC CCG  672
Pro Glu Asp Phe Ala Met Tyr Tyr Cys Gln Gln His Asn Glu Tyr Pro
    210                 215                 220

TGG ACG TTC GGT GGA GGG ACC AAG CTC GAG  702
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu
225                 230

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 234 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Leu Gln Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Arg Lys
1               5                   10                  15

Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Asn Phe Gly Met His
            20                  25                  30

Trp Val Arg Gln Ala Pro Glu Lys Gly Leu Glu Trp Val Ala Tyr Ile
        35                  40                  45

Ser Ser Gly Gly Thr Thr Ile Tyr Tyr Ser Asp Thr Met Lys Gly Arg
    50                  55                  60

Phe Thr Ile Ser Arg Asp Asn Pro Lys Asn Thr Leu Phe Leu Gln Met
65                  70                  75                  80

Thr Ser Leu Arg Ser Glu Asp Thr Ala Met Tyr Phe Cys Ala Arg Ser
                85                  90                  95

Trp Ala Tyr Ala Met Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val
            100                 105                 110

Ser Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
        115                 120                 125

Ser Asp Ile Glu Leu Thr Gln Ser Pro Ser Tyr Leu Ala Ala Ser Pro
    130                 135                 140

Gly Glu Ile Ile Thr Ile Asn Cys Arg Ala Ser Lys Ser Ile Ser Lys
145                 150                 155                 160

Tyr Leu Ala Trp Tyr Gln Glu Lys Pro Gly Lys Thr Asn Lys Leu Leu
                165                 170                 175

Ile Tyr Ser Gly Ser Ile Leu Gln Ser Gly Ile Pro Ser Arg Phe Ser
            180                 185                 190

Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Glu
        195                 200                 205

Pro Glu Asp Phe Ala Met Tyr Tyr Cys Gln Gln His Asn Glu Tyr Pro
    210                 215                 220

Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu
225                 230

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CATGCAGTCT TCGGGC 16 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
TTAAGCCCGA AGACTG 16 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Asn Val Ile Ser Lys Arg
1               5

CLAIMS

1. A process for producing fusion proteins comprising ScFv fragments by a
transformed mould, in which
(a) the mould belongs to the genus Aspergillus, and
(b) the Aspergillus contains a DNA sequence encoding the ScFv fragment
under control of at least one expression and/or secretion regulating region derived
from a mould selected from the group consisting of promoter sequences,
terminator sequences and signal sequence-encoding DNA sequences, and
functional derivatives or analogues thereof,
optionally followed by a proteolytic cleavage step for separating the ScFv fragment
part from the fusion protein.

2. A process according to claim 1, in which said "at least one expression
and/or secretion regulating region derived from a mould" is the combination of
both a promoter sequence and a signal sequence-encoding DNA sequence derived
from a glucoamylase gene ex Aspergillus plus a terminator sequence of a trpC gene
ex Aspergillus.

3. A process according to claim 1, in which said "at least one expression
and/or secretion regulating region derived from a mould" is derived from the
endoxylanase II gene (exlA gene) of Aspergillus niger var. awamori present on
plasmid pAW14B.

25 4. A process according to claim 1, in which said DNA sequence encoding 

the ScFv fragment forms part of a chimeric gene encoding a fusion protein, 
whereby said DNA sequence encoding the ScFv fragment is preceded at its 5* end 
by at least part of a structural gene encoding the mature part of a secreted mould 
protein. 

30 

5. A process according to claim 4, in which said structural gene encodes an
endoxylanase or a glucoamylase.

6. A process according to claim 4, in which said ScFv fragment in the fusion
protein is bound to said secreted mould protein or part thereof by a proteolytic
cleavage site.

7. A process according to claim 6, in which said cleavage site is a KEX2-like
site.

8. A process according to any one of claims 1-7, in which the mould is
cultured under such conditions that the yield of ScFv fragment is at least 40 mg/l,
preferably at least 60 mg/l, more preferably at least 90 mg/l and still more
preferably at least 150 mg/l.

9. New product comprising an ScFv fragment or fusion product thereof 
obtainable by a process according to any one of claims 1-8. 

10. New product according to claim 9, in which the ScFv fragment is a
modified ScFv fragment comprising complementarity determining regions (CDRs)
grafted on the framework regions of the variable fragments of another ScFv
fragment that is well expressed and secreted by a lower eukaryote.

11. New product according to claim 10, in which the lower eukaryote is a 
mould of the genus Aspergillus. 

12. Composition containing a product produced by a process as claimed in
any one of claims 1-8 or a new product as claimed in any one of claims 9-11.

13. Composition according to claim 12, which is a consumer product. 

14. Composition according to claim 12, in which the ScFv fragment
recognizes a compound present in the human eco-system, which compound can be
a microorganism, an enzyme or another protein.

15. Composition according to claim 14, in which the compound is present in
the oral cavity.

16. Composition according to claim 15, in which the compound is involved in
the formation of plaque, caries, gingivitis, periodontal diseases, or bad breath.

17. Composition according to claim 14, in which the compound is present on 
the human skin. 

18. Composition according to claim 17, in which the compound is involved in
the formation of malodour, inflammation, or hair loss.

19. Composition according to claim 14, in which the compound is a hormone, 
which composition can be used for diagnostic purposes. 

20. Composition according to claim 19, in which the hormone is human 
chorionic gonadotropin (HCG). 

21. Composition according to claim 12, in which the ScFv fragment
recognizes a compound present in the eco-system of domestic and agricultural
animals which compound can be a feed component, an enzyme or another protein,
or a disease causing agent.

22. Composition according to claim 12, in which the ScFv fragment
recognizes a compound that has a positive or negative relationship with a disease
or disorder and can be used for detection and/or targeting purposes.

23. Composition according to claim 12, which can be used in the chemical, 
petrol or pharmaceutical industry as catalyst or for detection purposes. 

24. A process for producing fusion proteins comprising ScFv fragments by a
transformed mould, in which

(a) the mould belongs to one of the genera Mucor, Neurospora, and
Penicillium, and

(b) the mould contains a DNA sequence encoding the ScFv fragment under
control of at least one expression and/or secretion regulating region derived from
a mould selected from the group consisting of promoter sequences, terminator
sequences and signal sequence-encoding DNA sequences, and functional
derivatives or analogues thereof,

optionally followed by a proteolytic cleavage step for separating the ScFv fragment
part from the fusion protein,
whereby optionally the mould is cultured under such conditions that the yield of
ScFv fragment is at least 40 mg/l, preferably at least 60 mg/l, more preferably at
least 90 mg/l and still more preferably at least 150 mg/l.

25. New product comprising an ScFv fragment or fusion product thereof
obtainable by a process according to claim 24.

26. Composition containing a product produced by a process as claimed in
claim 24 or a new product as claimed in claim 25.